1 Executive Summary (ABSTRACT)


The  project  consists  of  two  major  thrusts. 

Robust inference for sensor networks. The first thrust is on robust inference and connectivity for sensor and
ad hoc networks involving decentralized sensor nodes with unreliable communication channels. The
effort spans FY05-FY07. We have investigated two major problems under this thrust:

1. Decentralized signal processing for statistical inference when the communication between the
sensors and the fusion center is subject to channel outage. Major findings of the work were
published in the following paper:

•  Y.  Lin,  B.  Chen,  and  B.  Suter,  “Robust  binary  quantizers  for  detection  in  sensor  networks,” 
IEEE  Trans.  Wireless  Communications,  vol.  6,  pp.2172-2181,  June  2007. 

In the paper, the multiple description principle was adopted to provide robust inference
performance at the fusion center in the event that only a subset of the sensors successfully
send their output to the fusion center. It was found that proactively designing local sensor
processing provides significant performance gain over the approach in which all sensor outputs
are presumed reliably available at the fusion center when channel outage occurs.

2. Cooperative relay that minimizes the error probability at the destination. Recognizing the
equivalence between cooperative relay with finite alphabet sources and decentralized hypothesis
testing, we have developed a new framework for relay processing design that aims to
optimize the performance at the destination node in terms of error probability. Major findings of
the work were published in the following paper:

•  B.  Liu,  B.  Chen,  and  R.S.  Blum,  “Minimum  error  probability  cooperative  relay  design,”  IEEE 
Trans.  Signal  Processing,  vol.  6,  pp.  2172-2181,  June  2007. 

Throughput study of multi-user and free space MIMO communications. The second thrust studies throughput
issues for MIMO communications under different scenarios. The first scenario is when multiple MIMO
transmitters communicate with a single MIMO receiver, and we study
the sum-rate optimality of orthogonal transmission for such systems. Major findings of the work
were published in

• X. Shang, B. Chen, and J. Matyjas, "Sum capacity optimality of orthogonal communications
over vector Gaussian multiple access channels," IEEE Trans. Wireless Communications, vol.
7, pp. 4304-4311, November 2008.

Sufficient  conditions  and  necessary  conditions,  in  terms  of  channel  matrices  and  transmitter  power 
constraints,  for  orthogonal  transmissions  to  achieve  the  sum  capacity  of  a  vector  Gaussian  MAC 
were  obtained.  The  obtained  conditions  provide  a  unified  framework  that  helps  explain  many 
intuitive  and  known  results  as  well  as  explore  cases  that  have  not  been  addressed.  In  the  cases 
when  these  conditions  are  violated,  the  developed  results  enable  us  to  quantify  the  suboptimality  of 
orthogonal  transmission  when  the  sum  capacity  can  only  be  achieved  by  overlay  transmission.  The 
second  scenario  concerns  MIMO  communication  with  airborne  platforms,  i.e.,  free  space  MIMO 
communication  when  there  is  a  lack  of  scattering  in  the  transmission  medium.  Our  primary  effort 
for  this  problem  involves  the  development  of  a  GUI  software  system  that  studies  the  theoretical 
throughput  under  realistic  channel  conditions  in  terms  of  antenna  size/spacing,  platform  velocity, 
and power constraints. The developed software allows us to study the throughput of MIMO
peer-to-peer communications under various airborne network configurations. One of the major findings
is that for most tested platform trajectories, the impact of interference is rather limited even
when the receivers completely ignore the interference, i.e., treat interference as noise. The
primary reason is that the spatial diversity afforded by the large antenna aperture (instead
of scattering, as in terrestrial channels) gives rise to immunity to multi-user interference for
concurrent transmissions.

The rest of the final report primarily presents results related to the development of the GUI
software. For the other three problems investigated under this effort, the three archival papers
cited above are included as an appendix to this final report.
2 Introduction


The  project  consists  of  two  thrusts:  1)  robust  inference  in  sensor  and  ad  hoc  networks;  2)  throughput 
analysis  of  multiuser  MIMO  systems.  This  chapter  provides  a  synopsis  of  the  major  contributions  under 
this  effort.  Many  of  the  research  results  have  been  reported  in  archival  papers  and  have  been  widely 
disseminated to the research community. We therefore only summarize those research results briefly and
leave the details to the archival papers, which are attached as an appendix to this report. The
development of the GUI system for throughput study of free space MIMO communications is instead
described in detail in the next chapter.

2.1  Robust  inference  in  sensor  and  ad  hoc  networks 

For  the  emerging  wireless  sensor  networks  (WSN),  distributed  signal  processing  design  has  to  deal  with 
various  physical  limitations  imposed  by  severe  resource  constraints.  For  example,  the  power  and 
bandwidth  constraints,  coupled  with  the  interference  and  channel  fading,  may  result  in  transmission 
loss  due  to  channel  outage  [1].  In  addition,  low  cost  sensor  nodes  deployed  in  harsh  environments  may 
be subject to sensor failure, making them unavailable for sensing/communication [2].

Our  work  studied  robust  signal  processing  techniques  for  inference-centric  distributed  sensor  networks 
operating  in  the  presence  of  possible  sensor  and/or  communication  failures.  Motivated  by  the  multiple 
description  (MD)  principle  [3,4],  we  develop  robust  distributed  quantization  schemes  for  a  decentralized 
detection  system.  Specifically,  focusing  on  a  two-sensor  system,  our  design  criterion  mirrors  that  of  MD 
principle:  if  one  of  the  two  transmissions  fails,  we  can  guarantee  an  acceptable  performance,  while 
enhanced  performance  can  be  achieved  if  both  transmissions  are  successful.  Different  from  the 
conventional  MD  problem  is  the  distributed  nature  of  the  problem  as  well  as  the  use  of  error  probability 
as  the  performance  measure.  Two  different  optimization  criteria  are  used  in  the  distributed  quantizer 
design,  the  first  a  constrained  optimization  problem,  and  the  second  using  an  erasure  channel  model. 
We  demonstrate  that  these  two  formulations  are  intrinsically  related  to  each  other.  Further,  using  a 
person-by-person  optimization  approach,  we  propose  an  iterative  algorithm  to  find  the  optimal  local 
quantization  thresholds.  A  design  example  is  provided  to  illustrate  the  validity  of  the  iterative  algorithm 
and the improved robustness compared to the classical distributed detection approach that disregards
the possible transmission losses.

Technical details can be found in [5], which is attached in the appendix of this final report.

2.2  Minimum  error  probability  cooperative  relay  design 

In wireless networks, a severe limiting factor is multipath-induced channel fading. One of the most
effective  methods  in  mitigating  fading  is  to  exploit  diversity  [6].  Examples  include  spatial  diversity 
when  multiple  antennas  are  used  at  the  transceivers,  multipath  diversity  in  frequency-selective  channels, 
and  temporal  diversity  in  time-selective  fading  channels  through  the  use  of  coding/interleaving.  More 
recently,  a  new  diversity  resource  has  attracted  considerable  attention,  especially  in  the  context  of 
wireless  ad  hoc  networks  [7-9].  There,  multiple  nodes  collaborate  in  transmitting  their  information, 
thus providing diversity by exploiting the independence of the fading channels of different users. This is
generally referred to as cooperative diversity, and the collection of cooperating nodes, including the
source and the destination nodes, is referred to as a relay network.

Recognizing  the  connection  between  cooperative  relay  with  finite  alphabet  sources  and  the  distributed 
detection  problem,  our  effort  studies  relay  signaling  design  via  channel  aware  distributed  detection  theory. 
Focusing on a wireless relay network composed of a single source-destination pair with relay nodes, we
derive  the  necessary  conditions  for  optimal  relay  signaling  that  minimizes  the  error  probability  at  the 
destination  node.  The  derived  conditions  are  person-by-person  optimal:  each  local  relay  rule  is  optimized 
by  assuming  fixed  relay  rules  at  all  other  relay  nodes  and  fixed  decoding  rule  at  the  destination  node. 
An  iterative  algorithm  is  proposed  for  finding  a  set  of  relay  signaling  approaches  that  are  simultaneously 
person-by-person  optimal.  Numerical  examples  indicate  that  the  proposed  scheme  provides  performance 
improvement over the two existing cooperative relay strategies, namely amplify-forward and
decode-forward.

Technical  details  can  be  found  in  [10],  also  attached  in  the  appendix  of  this  final  report. 

2.3  Throughput  Optimality  of  Orthogonal  Transmissions  for  MIMO 
Multiple  Access  Channels 

It  is  well  known  that,  for  a  scalar  Gaussian  MAC,  orthogonal  transmissions,  e.g.,  frequency  division 
multiple  access  (FDMA)  or  time  division  multiple  access  (TDMA)  under  an  average  power  constraint, 
can achieve the sum capacity [11]. As such, although FDMA and TDMA are suboptimal in terms of
the entire capacity region, if only the system throughput is of concern, orthogonal transmissions are
sufficient, resulting in a much simplified transceiver structure, i.e., no successive interference cancellation
is needed. A similar result holds for a scalar Gaussian MAC with more than two users. For a vector Gaussian
MAC, the above claim, that orthogonal transmissions achieve the sum capacity, is not necessarily true.
Indeed,  it  is  observed  that  in  most  cases  orthogonal  transmissions  fall  well  short  of  achieving  the  sum 
capacity  of  a  vector  Gaussian  MAC  [12]. 

The goal of this study is twofold. First, we establish sufficient and necessary conditions for orthogonal
transmissions to be optimal in achievable sum rate for a vector Gaussian MAC. The established
conditions, in terms of singular values and singular vectors of the channel matrices as well as the power
constraints, provide a unified framework behind many intuitive and well known results. In addition, they
allow us to examine cases that have not been explored before in terms of the (sub)optimality of orthogonal
transmissions for a vector Gaussian MAC. We show that the channel must have proportional singular
values, well aligned singular vectors and appropriate power constraints in order for FDMA/TDMA to
achieve the sum capacity. Second, using the established conditions, we attempt to provide a quantitative
measure of the performance degradation of orthogonal transmissions when they are suboptimal.

Technical details can be found in [13], attached in the appendix of this final report.

2.4  MIMO  Communications  in  Airborne  Platforms 

Free space MIMO had not been a fruitful research area due to the belief that the lack of scattering
prevents us from harvesting the potential throughput gains of MIMO communications. However, Dr.
Gans, using realistic settings of airborne platforms, demonstrated that free space MIMO still yields
considerable throughput gains that warrant a serious second look. Our effort is to develop a GUI
simulation system that allows one to visualize the throughput comparison of various airborne communication
scenarios. The GUI system will be described in detail in the next chapter.

2.5  Summary  of  Other  Contributions 

The  award  has  provided  (partial)  financial  support  to  several  graduate  students  over  the  past  years. 
Two of them, Drs. Ying Lin and Bin Liu, have since graduated and have taken academic appointments
in the US and overseas. This award was also instrumental in facilitating close collaboration between the
PI  and  AFRL  researchers.  The  PI  has  visited  AFRL/Rome  Research  Site  numerous  times  during  the 
project  period.  Some  of  the  research  results  presented  in  this  report  result  from  direct  collaborations 
with  AFRL  researchers. 
collaborators that he has had the fortune to work with over the years, including Dr. Michael Gans,
Dr. John Matyjas, and Dr. Bruce Suter. Such collaborations were not only fruitful in terms of generating
are relevant to the AF and DoD at large. The PI has benefited greatly from many discussions with Drs.
Gans, Matyjas, and Suter, some of them during the regular coffee breaks at the lab. The PI is especially
indebted to Dr. Gans, who graciously hosted the PI's summer visits in 2004 and 2005. Dr. Gans'
technical expertise in diverse areas ranging from communication theory to antenna to hardware design,
his valuable insight on many of the seemingly complex problems, and above all, the scientific rigor with
which he conducts research have been a constant inspiration to the PI.
3 GUI System for Throughput Analysis of Free Space MIMO Systems

3.1  Introduction 

Communication systems utilizing multiple antenna elements were shown to provide throughput
improvement of several magnitudes compared with single antenna systems [14]. The throughput
gain is attained because of the spatial diversity, which was typically believed to exist only in a rich
scattering environment. As such, there had been much doubt about the suitability and relevance of MIMO
communications in airborne platforms, where scatterers are hard to find.

However,  using  realistic  aircraft  platforms,  it  was  demonstrated  in  [15]  that  free  space  MIMO 
communication  may  still  harvest  potentially  significant  throughput  gains  due  to  the  existence  of  spatial 
diversity arising from the large aperture of transceiver antenna arrays. This work largely motivates the
development of this simulation platform, which attempts to validate the throughput potential of free space
MIMO as well as to provide insights on the impact of multiple transceiver pairs on
network throughput in free space MIMO systems.

The  purpose  of  any  simulation  is  to  provide  foresight  on  how  an  actual  system  might  function  when  put 
to  work.  Our  simulation  software  provides  a  comparison  on  the  data  rates  that  can  be  achieved  by  a 
Line  of  Sight  (LOS)  MIMO  system  using  different  transmission  schemes.  Here,  by  LOS  MIMO  systems  we 
refer  to  airborne  networks.  This  simulation  software  was  developed  in  MATLAB  version  7.0.0.19920 
(R14). 

The  different  transmission  schemes  that  we  consider  can  be  broadly  classified  into  two  categories,  viz., 
Channel Blind and Channel Aware. For the Channel Blind case we put equal power on all the antennas.
In the Channel Aware case, we use Beamforming or Waterfilling. All these schemes are
compared  with  the  ergodic  capacity  that  can  be  achieved  under  suitable  channel  conditions.  Another 
feature  of  this  simulation  software  is  to  provide  a  visualization  of  the  trajectories  that  the  airborne 
objects  follow  and  the  corresponding  data  rates. 
3.2 GUI Overview


The simulation software GUI consists of one (1) Main (or parent) window and four (4) child windows.
The parent window has two primary purposes: first, to invoke the child windows and set Jet
parameters; second, to display the plots in different configurations. The parent window also
invokes the Animation window and displays the Jet Parameters window. The Jet Parameters
window is used to set the trajectory, number of antennas, antenna configuration and speed of a Jet.
The Channel Parameters window is used to set the communication parameters, viz., the carrier frequency
and the bandwidth. The Plot and Animation window is used for data rate and trajectory visualization.

For  the  given  start  and  end  co-ordinates  of  a  Jet,  it  is  assumed  to  move  in  a  straight  line  such  that  its 
length  is  oriented  along  the  line  joining  the  co-ordinates.  There  is  no  rolling  of  the  Jet  about  the  line 
joining  these  co-ordinates. 
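
As an illustration of the straight-line motion model, the following minimal Python sketch computes a Jet position at a given time. The function name `jet_position`, the use of kilometres for the co-ordinates, and the clamping at the end point are assumptions made for this example; they are not taken from the MATLAB software.

```python
import math


def jet_position(start, end, speed_kmh, t_seconds):
    """Position of a jet flying in a straight line from start to end.

    Assumptions of this sketch: co-ordinates are in kilometres, the speed
    is constant, and the jet stays at the end point once it arrives.
    """
    length_km = math.dist(start, end)
    if length_km == 0.0:
        return tuple(float(s) for s in start)
    travelled_km = speed_kmh * t_seconds / 3600.0
    # Fraction of the segment covered so far, clamped to the end point.
    frac = min(travelled_km / length_km, 1.0)
    return tuple(s + frac * (e - s) for s, e in zip(start, end))
```

For instance, a jet cruising at 3600 km/hr covers 1 km per second, so after 5 seconds on a 10 km segment it is at the midpoint.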

3.2.1  Main  Window 


Figure  1:  Main  Window 

This  window  is  the  main  and  first  user  interface  that  the  simulation  software  provides  the  user  when  it  is 
run.  We  start  the  simulation  by  setting  the  Number  of  Jets  edit  box  and  then  pressing  Set  Jets  button, 
which  is  situated  above  the  edit  box.  This  invokes  Jet  Parameters  window  and  then  consequently 
Channel  Parameters  window.  After  the  user  is  done  with  inputting  the  required  parameters,  the 
application  starts  calculating  the  data  points.  These  data  points  can  be  plotted  in  3  different 
configurations  on  the  window,  i.e.,  1  Plot,  2  Plots  &  4  Plots. 

When the Random Trajectories check box is checked, the application fills in randomly generated values for
the start and end co-ordinates of the Jets. The start and end co-ordinates are generated from
a uniform distribution between 0 and 1. Scaling Factor is used to scale them up to the desired values.
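
The random trajectory generation can be sketched as follows. This is a minimal Python illustration, assuming three-dimensional co-ordinates and a single scalar scaling factor; the function name `random_trajectory` is hypothetical and the original software is written in MATLAB.

```python
import random


def random_trajectory(scaling_factor, rng=random):
    """Draw start and end points uniformly from [0, 1)^3, then scale them.

    The three-dimensional co-ordinate layout is an assumption of this sketch.
    """
    start = tuple(scaling_factor * rng.random() for _ in range(3))
    end = tuple(scaling_factor * rng.random() for _ in range(3))
    return start, end
```

Passing a seeded `random.Random` instance as `rng` makes the generated trajectories reproducible between runs.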

The YAxis and XAxis popup menus are used to select the data that the corresponding axes should plot. The
table below the popup menus gives the Jet parameters for the selected communication pair.

The  New  Window  button  is  used  to  export  the  selected  plot  to  a  new  axes  window.  This  allows  the 
user  to  access  the  features  that  are  available  in  the  axes  window,  which  are  not  available  on  the  Parent 
Window  axes. 

The  Start  Animation  button  invokes  Plot  and  Animation  window,  which  is  used  to  display  animation  of 
the  jets  along  with  their  corresponding  data  rates.  The  New  Window  and  Start  Animation  buttons  are 
disabled  until  the  data  points  are  available. 

The  different  plot  configurations  are  shown  in  figure  2  and  figure  3: 


Figure  2:  2  Plots 


3.2.2  Jet  Parameters  Window 

The purpose of this window is to get Jet parameters from the user. When the window is invoked by the Set
Jets button on the parent window, the parent window passes the previously used Jet parameters as input
to this window. This saves the user from re-entering the parameters if no or very few changes are to be
made. All the fields on the window are self-explanatory.

We enter the start and end co-ordinates for the jets in the corresponding fields. The Cruise Speed has
to be set in km/hr. The antenna co-ordinates file contains the antenna co-ordinates. Once all these
parameters are entered, we need to tell the application which other jet the current jet intends to
communicate with. This can be done by checking the appropriate box on the right side of the window. Only
one box can be checked at a time.

Figure 3: 4 Plots

The number of windows that are opened sequentially depends on the value that is put in the Number
of Jets edit box in the parent window. After the parameters for all the jets are set, the Channel
Parameters window is invoked.

3.2.3  Channel  Parameter  Window 

This window is used to get the communication parameters from the user. Communication can take
place in two modes, viz., Fixed Power and Fixed SNR. The mode is selected by checking the appropriate
radio button. Similarly, the Frequency and Bandwidth have to be set by the user. These parameters apply
to all the communicating pairs.

Similar  to  Jet  Parameters  window,  the  Main  window  passes  previously  stored  channel  parameters 
while  invoking  this  window,  so  as  to  avoid  re-entering  them.  After  pressing  the  OK  button,  it  returns  a 
JetsPairParam  object  to  the  Parent  Window  and  the  main  application  starts  calculating  the  data  points. 

3.2.4  Plot  &  Animation  Window 

This window is invoked when the user presses the Start Animation button on the Main window. The left axes on
the window show the trajectories of the Jets. Each solid blue circle indicates a Jet position. The Jet
number and the role it is playing in communication, i.e., whether it is acting as transmitter or receiver
(T and R respectively), are displayed above the Jet position. The solid line indicates a communicating pair
and the dashed line indicates interference from other transmitters.

The  right  axes  shows  the  corresponding  data  rates  that  are  achieved  at  that  location. 

The slider is used to control the position of the Jets in the time domain. After the play button is pressed,
the slider is moved periodically in the forward direction, which changes the Jet positions, thereby giving the
effect of animation.

Figure 4: Jet Parameters

3.2.5  Other  Options  Window 

This  window  can  be  opened  by  pressing  the  Options  button  on  the  Main  window.  Here,  the  Resolution 
edit  box  gives  the  number  of  data  points  that  need  to  be  calculated.  The  Scatter  Loss  gives  the  scaling 
parameter  for  the  scatter  matrix. 

For our simulation, we use two kinds of scatter matrices. The first is the deterministic case, in
which we set the scatter points on the jet surface and calculate the scatter matrix accordingly. The second is
the random case, in which we consider each element of the matrix to be complex Gaussian
distributed with mean 0 and variance 1.
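
A minimal Python sketch of the random case is given below. It assumes a circularly symmetric complex Gaussian, i.e., independent real and imaginary parts with variance 1/2 each so that the total variance is 1; this split, the matrix dimensions, and the function name `random_scatter_matrix` are assumptions of the sketch.

```python
import math
import random


def random_scatter_matrix(rows, cols, rng=random):
    """Matrix of i.i.d. complex Gaussian entries with mean 0 and variance 1.

    Assumption: real and imaginary parts are independent N(0, 1/2),
    which gives E|z|^2 = 1 for every entry z.
    """
    sigma = math.sqrt(0.5)  # standard deviation per real component
    return [[complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
             for _ in range(cols)]
            for _ in range(rows)]
```

Averaging |z|^2 over a large matrix generated this way should give a value close to 1, which is a quick sanity check on the normalization.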

3.2.6  Jet  Parameters  for  Random  Trajectories  Window 

This  window  is  invoked  when  the  Random  Trajectories  check  box  is  checked  and  the  user  presses  Set 
Jets  on  the  Main  window.  The  Cruise  Speed,  Number  of  Antennas  and  the  Antennas  File  Path  are 
common  to  all  the  jets  in  this  case. 

3.3  Formulae 

Let $H_{ij}$ be the channel matrix between receiver $i$ and transmitter $j$ as calculated in [15], $K_{jj}$ be the
transmit covariance matrix, $t$ be the number of transmit antennas and $r$ be the number of receive
antennas. We assume the noise to be complex Gaussian distributed with mean 0 and variance 1.

Figure 5: Channel Parameters

Figure 6: Plots and Animation Parameters

3.3.1 Channel Blind MIMO Rate

In this approach, i.e., when the transmitter does not know the Channel State Information (CSI), the
transmitter puts equal power on all the transmit antennas, such that the transmit covariance matrix is

$$K_{jj} = \frac{P}{t} I_t,$$

where $P$ is the total power constraint on the transmitter and $I_t$ is a $t \times t$ identity matrix.


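The rate at receiver $i$, treating interference as noise, is given below.

$$R_i^{\text{Non-CSI}} = \log_2 \left| I + \rho H_{ii} K_{ii} H_{ii}^{\dagger} \left( I + \sum_{l \neq i} \eta_{il} H_{il} K_{ll} H_{il}^{\dagger} \right)^{-1} \right| \tag{1}$$

Here, $H_{ij}^{\dagger}$ is the Hermitian (conjugate transpose) of $H_{ij}$, $I$ is an identity matrix of size $r \times r$, $\rho$ is the average SNR and $\eta_{il}$ is the INR.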
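
The rate computation can be sketched in Python as follows. The sketch assumes the rate has the form log2|I + ρ H K H†(I + Σ η H K H†)^(-1)| with signal term S and interference-plus-noise term N, and it uses the identity det(I + S N^(-1)) = det(N + S) / det(N) to avoid an explicit matrix inverse. All function names are hypothetical, and matrices are plain lists of rows.

```python
import math


def hermitian(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[complex(x) for x in row] for row in m]
    n = len(a)
    result = complex(1.0)
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        if a[pivot][col] == 0:
            return complex(0.0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return result


def equal_power_covariance(total_power, t):
    """Channel-blind transmit covariance (P / t) * I_t."""
    return [[total_power / t if i == j else 0.0 for j in range(t)]
            for i in range(t)]


def scaled_hkh(h, k, scale):
    """Return scale * H K H^dagger."""
    return [[scale * x for x in row]
            for row in matmul(matmul(h, k), hermitian(h))]


def add(a, b):
    """Element-wise sum of two matrices of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def rate_interference_as_noise(h_ii, k_ii, rho, interferers=()):
    """Rate in bits per channel use; interferers holds (eta, H_il, K_ll)."""
    r = len(h_ii)
    noise = [[1.0 if i == j else 0.0 for j in range(r)] for i in range(r)]
    for eta, h_il, k_ll in interferers:
        noise = add(noise, scaled_hkh(h_il, k_ll, eta))
    total = add(noise, scaled_hkh(h_ii, k_ii, rho))
    # det(I + S N^-1) = det(N + S) / det(N); both determinants are real
    # and positive because the matrices are Hermitian positive definite.
    return math.log2((det(total) / det(noise)).real)
```

With an identity channel, two antennas, P = 2 and ρ = 1, the interference-free rate is 2 bits per channel use; adding one identical interferer with η = 1 lowers it to 2 log2(1.5).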
Figure 7: Other Options

3.3.2  MIMO  Rate  2  (Beamforming) 

This transmission scheme is used when the transmitter knows the Channel State Information, in the sense
that it knows the right dominant eigenvector of the channel matrix by means of some feedback mechanism
from the receiver. Let $v_i$ be the right dominant eigenvector of the channel matrix $H_{ii}$. We set the transmit
covariance matrix $K_{ii}$ as

$$K_{ii}^{BF} = P\, v_i v_i^{\dagger} \tag{2}$$

We calculate the corresponding rate by substituting this into the rate expression (1).
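
A minimal Python sketch of the beamforming covariance follows. It assumes that the dominant right vector is the dominant eigenvector of H†H, obtained here by power iteration, and that the covariance is the rank-one matrix P v v† with total power P. The function names and the iteration count are assumptions of the sketch.

```python
import math


def dominant_right_vector(h, iterations=200):
    """Unit-norm dominant eigenvector of H^dagger H via power iteration."""
    t = len(h[0])
    # Gram matrix H^dagger H of size t x t.
    gram = [[sum(h[k][i].conjugate() * h[k][j] for k in range(len(h)))
             for j in range(t)] for i in range(t)]
    # Start from a uniform unit vector; this assumes the start vector is
    # not orthogonal to the dominant eigenvector.
    v = [complex(1.0 / math.sqrt(t))] * t
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(t)) for i in range(t)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def beamforming_covariance(h, total_power):
    """Rank-one transmit covariance P * v v^dagger (trace equals P)."""
    v = dominant_right_vector(h)
    return [[total_power * v[i] * v[j].conjugate() for j in range(len(v))]
            for i in range(len(v))]
```

For a diagonal channel with gains 2 and 1, all of the power is placed on the first antenna, which is the stronger eigenmode.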

3.3.3 MIMO Capacity (Waterfilling)

We calculate the waterfilling matrix $K_{ii}^{WF}$ from the given channel matrix $H_{ii}$, and plug it into (1). We can
use this transmission scheme only when the transmitter has complete channel state information.
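
The power allocation step of waterfilling can be sketched as below. This is the textbook water-filling rule p_k = max(0, μ − 1/g_k) over the eigenmodes of the channel, where the gains g_k are assumed to be positive (for example, ρ times the squared singular values of the channel matrix). The sketch covers only the per-mode power allocation; building the covariance matrix additionally requires the singular vectors of the channel, which is omitted here. The function names are hypothetical.

```python
import math


def waterfill(gains, total_power):
    """Water-filling powers p_k = max(0, mu - 1/g_k) with sum p_k = P.

    gains: positive eigenmode gains; total_power: P > 0.
    """
    # Indices of the modes, strongest first.
    order = sorted(range(len(gains)), key=lambda k: gains[k], reverse=True)
    powers = [0.0] * len(gains)
    # Try the largest active set first and drop the weakest mode until
    # every active mode receives strictly positive power.
    for active in range(len(order), 0, -1):
        idx = order[:active]
        mu = (total_power + sum(1.0 / gains[k] for k in idx)) / active
        if mu - 1.0 / gains[idx[-1]] > 0.0:
            for k in idx:
                powers[k] = mu - 1.0 / gains[k]
            break
    return powers


def waterfill_rate(gains, powers):
    """Sum rate in bits per channel use over the parallel eigenmodes."""
    return sum(math.log2(1.0 + g * p) for g, p in zip(gains, powers))
```

With gains [1.0, 0.1] and P = 1, the weak mode is switched off and all power goes to the strong one; with equal gains the power is split evenly.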

3.3.4  Rayleigh  Ergodic  Capacity 

For  this  case  we  use  the  formula  given  in  [15],  which  is  as  follows: 

CEr,o,ic  ^  ^|2  [l  +  p  +  ^]  _  (3) 

where,  <7  =  0.25(.^4/7  + 1  - 1],  3/  is  the  number  of  antennas  and  p  is  the  average  SNR  at  each 

receiver antenna. We assume the same number of antennas at the transmitter and the receiver.
Figure 8: Jet Parameters For Random Trajectories

3.3.5 SISO Rate


This is the rate for a single input single output case,

$$R^{SISO} = \log_2 \left( 1 + \frac{\rho}{1 + \eta} \right) \tag{4}$$

where $\rho$ is the average SNR at the receiver antenna and $\eta$ is the INR.
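
As a quick numerical illustration, the following Python sketch evaluates a single-antenna rate with interference treated as noise. It assumes the rate takes the form log2(1 + ρ / (1 + η)); the function name `siso_rate` is hypothetical.

```python
import math


def siso_rate(rho, eta=0.0):
    """SISO rate in bits per channel use, interference treated as noise.

    Assumed form: log2(1 + rho / (1 + eta)), with rho the average SNR
    and eta the interference-to-noise ratio (INR).
    """
    return math.log2(1.0 + rho / (1.0 + eta))
```

For example, an SNR of 3 gives 2 bits per channel use without interference, and 1 bit per channel use when the INR is 2.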

3.3.6  Channelized  MIMO  Rate 

If there are $M$ communicating pairs, assuming FDMA or TDMA, the rate for each link is given by

$$R_i^{\text{Ch-MIMO}} = \frac{1}{M} \log_2 \left| I + \rho M H_{ii} K_{ii} H_{ii}^{\dagger} \right| \tag{5}$$
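
The channelized rate can be sketched in Python as follows, assuming the form (1/M) log2|I + ρ M H K H†|: each link uses a fraction 1/M of the time or bandwidth without interference, and its SNR is scaled by M. The helper and function names are hypothetical, and matrices are plain lists of rows.

```python
import math


def hermitian(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[complex(x) for x in row] for row in m]
    n = len(a)
    result = complex(1.0)
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        if a[pivot][col] == 0:
            return complex(0.0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return result


def channelized_rate(h_ii, k_ii, rho, num_pairs):
    """Per-link rate under FDMA/TDMA with num_pairs communicating pairs."""
    r = len(h_ii)
    hkh = matmul(matmul(h_ii, k_ii), hermitian(h_ii))
    m = [[(1.0 if i == j else 0.0) + rho * num_pairs * hkh[i][j]
          for j in range(r)] for i in range(r)]
    # The matrix is Hermitian positive definite, so its determinant is real.
    return math.log2(det(m).real) / num_pairs
```

With a 2 x 2 identity channel, identity covariance and ρ = 1, a single pair achieves 2 bits per channel use, while each of two channelized pairs achieves log2(3), about 1.585 bits.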
Abstract — We consider robust signal processing techniques
for inference-centric distributed sensor networks operating in
the presence of possible sensor and/or communication failures.
Motivated by the multiple description (MD) principle, we develop
robust distributed quantization schemes for a decentralized
detection  system.  Specifically,  focusing  on  a  two-sensor  system, 
our  design  criterion  mirrors  that  of  MD  principle:  if  one  of 
the  two  transmissions  fails,  we  can  guarantee  an  acceptable 
performance,  while  enhanced  performance  can  be  achieved  if 
both  transmissions  are  successful.  Different  from  the  conventional 
MD  problem  is  the  distributed  nature  of  the  problem  as  well 
as  the  use  of  error  probability  as  the  performance  measure. 
Two  different  optimization  criteria  are  used  in  the  distributed 
quantizer  design,  the  first  a  constrained  optimization  problem, 
and  the  second  using  an  erasure  channel  model.  We  demonstrate 
that  these  two  formulations  are  intrinsically  related  to  each  other. 
Further,  using  a  person-by-person  optimization  approach,  we 
propose an iterative algorithm to find the optimal local quantization
thresholds. A design example is provided to illustrate the
validity  of  the  iterative  algorithm  and  the  improved  robustness 
compared  to  the  classical  distributed  detection  approach  that 
disregards  the  possible  transmission  losses. 

Index  Terms — Distributed  detection,  erasure  channels,  fading 
channels,  sensor  networks. 


1.  Introduction 

For the emerging wireless sensor networks (WSN), distributed
signal processing design has to deal with various
physical limitations imposed by severe resource constraints.
For example, the power and bandwidth constraints, coupled
with the interference and channel fading, may result in
transmission loss due to channel outage. In addition, low-cost
sensor nodes deployed in harsh environments may be
subject to sensor failure, making them unavailable for
sensing/communication.

A conventional approach to combat transmission loss is
to exploit channel diversity through the use of multiple description
(MD) design [1] such as the MD codes [2] or MD
quantizers [3]. This MD idea is illustrated in Fig. 1(a) with two
encoders and three decoders [2]. The encoders are so designed
that  in  the  case  of  loss  of  one  of  the  two  transmissions,  the  side 
decoders  (Decoder  1  or  Decoder  2)  are  guaranteed  with  certain 
acceptable  performance;  if  both  transmissions  are  successful, 
the central decoder output (corresponding to Decoder 0) will
have enhanced performance. As sensor failure can be dealt
with in an identical fashion under the MD framework, we
will no longer distinguish the two types of losses, one due to
channel outage and the other due to sensor failure.

Fig. 1. Comparison between (a) conventional MD, and (b) distributed MD
for sensor network applications. In (a), Encoders 1 and 2 have access to the
same observation $X$. In (b), Encoder 1 encodes $X_1$ without access to $X_2$
while Encoder 2 encodes $X_2$ without access to $X_1$.

To carry over the MD principle to sensor network applications,
care must be taken in considering the distinct features
for  distributed  sensor  networks.  Two  of  the  critical  differences 
are  listed  below  and  are  what  motivate  the  current  work. 

•  Distributed  nature  of  WSN. 

In  the  conventional  MD  framework,  two  encoders  operate 
on  a  common  source.  In  WSN,  each  encoder  resides 
in  a  sensor  and  operates  only  on  its  own  observations 
without  access  to  the  other  sensor’s  observations.  This  is 
illustrated  in  Fig.  1. 

•  Inference-centric  nature  of  WSN. 

In WSN applications, all the sensor nodes are typically engaged in a collective inference task. The
ultimate goal may be the evaluation of some underlying state instead of recovering the sensor
observations. In reference to Fig. 1(b), the goal may be to infer the unknown parameter $\theta$ instead
of recovering $X_1$ and $X_2$. This is in contrast to the conventional MD problem where the goal is to
recover the original source data. A direct consequence is that, instead of the conventional distortion
measures used in traditional MD quantizer design, other performance metrics that cater to the inference
task may be more relevant.

In this paper, we study how the MD principle can be adapted to inference-centric applications with
distributed quantizer design. By focusing on a binary decentralized hypothesis testing problem (i.e.,
$\theta$ is binary in Fig. 1(b)), we investigate distributed binary quantizer design using the MD
principle. We term this new framework distributed multiple description quantizer (DMDQ) design. The DMDQ
approach achieves robust inference performance in the presence of channel outage or sensor failure as it
strikes a better tradeoff between the detection performance at the fusion center and that of the local
sensors.

The  proposed  scalar  quantizer  design  is  closely  related 
to  the  classical  distributed  detection  problem  [4],  [5]  as  it 
involves  the  design  of  multiple  sensor  decision  rules  that  are 
coupled  with  each  other.  Major  differences  exist,  and  the  most 
significant  is  that  we  no  longer  deal  with  a  single  objective 
function  (minimum  error  probability  at  the  fusion  center). 
Instead,  multiple  design  objectives  need  to  be  considered,  each 
corresponding  to  the  end-to-end  inference  performance  for  a 
particular  channel  outage  or  sensor  failure  state. 

To explain the significance of the proposed approach, and in particular, to understand its improved
robustness compared with the classical distributed detection design, consider the following simple
example. Assume a binary hypothesis testing problem with a two-sensor parallel fusion system where each
sensor employs a binary quantizer. The two hypotheses under test, $H_0$ and $H_1$, are a priori equally
likely. The local sensor observations at the two sensors, $X_1$ and $X_2$, are conditionally independent
and identically distributed ternary random variables with

$$
\begin{array}{ll}
P(X_k = 0 \mid H_0) = 0.95, & P(X_k = 0 \mid H_1) = 0.05, \\
P(X_k = 1 \mid H_0) = 0.05, & P(X_k = 1 \mid H_1) = 0.9, \\
P(X_k = 2 \mid H_0) = 0, & P(X_k = 2 \mid H_1) = 0.05,
\end{array}
$$

for $k = 1, 2$. By the monotonicity of the likelihood ratio (LR) in the sensor observations (i.e., the
local sensor LR values are monotone in $X_k$), we need to consider only the two binary local decision rules
at each sensor [6]:

$$
\text{Rule A:}\quad U_k = \begin{cases} 0, & X_k = 0 \\ 1, & X_k = 1 \text{ or } 2 \end{cases}
\qquad\qquad
\text{Rule B:}\quad U_k = \begin{cases} 0, & X_k = 0 \text{ or } 1 \\ 1, & X_k = 2 \end{cases}
$$

Adopting the classical distributed detection approach, it is straightforward to show that the two sensors
should employ different decision rules to achieve a minimum error probability of 0.04875 at the fusion
center. Assume that, without loss of generality, sensor 1 uses Rule A while sensor 2 uses Rule B. If
sensor 1's decision does not reach the fusion center, either due to a channel outage or a sensor failure,
the actual minimum error probability by using the decision from sensor 2 alone becomes 0.475, which is a
significant degradation from the case when both sensor outputs are available. This error probability
renders the detection system essentially useless as it is close to 0.5. A more robust design is to use
Rule A at both sensors. In this case, both the fusion center and each local sensor have identical error
probability 0.05, thus there is no degradation in the event of a lost transmission.¹ Compared with the
classical distributed detection approach (whose error probability pair is 0.04875 and 0.475), the
alternative approach provides a more robust performance in the presence of a transmission loss.

¹ This simple example also indicates that, depending on the local decision rules used and the observation
distributions, having more sensors in the system may not always improve the overall performance.

The proposed DMDQ also provides an alternative approach to the channel-aware design for a decentralized
detection problem [7]-[9] in dealing with imperfect channels. The channel-aware quantization schemes
require that the channel state information (CSI) be available to attain optimum performance. Acquiring
CSI, however, may be too costly in systems with stringent resource constraints. It is, therefore,
imperative to consider quantizer design that is robust to potential channel outages without the knowledge
of CSI. The proposed DMDQ framework is an initial attempt toward robust and proactive signaling for
distributed sensor networks in the absence of CSI.

The rest of the paper is organized as follows. In the next section, we describe the problem formulation
and introduce the two-sensor fusion network with possible transmission losses. In Section III, we apply
the Lagrangian method to solve the constrained minimization and to obtain necessary conditions for optimum
binary quantizers in the form of LR test (LRT) thresholds. In Section IV, we impose the discrete
memoryless erasure channel model and obtain the corresponding optimum local decision rules using the
channel-aware quantizer design methodology described in [7], [8]. Numerical results are presented in
Section V to demonstrate how the proposed quantizer design can be implemented and to show its improved
robustness over the classical distributed detection approach. We conclude in Section VI.
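The error probabilities quoted in the two-sensor ternary example above can be checked by brute force. The following is a minimal sketch, not part of the original paper: it enumerates the four rule assignments and evaluates the MAP error probability with both outputs available and with a single surviving output. The names `RULES`, `output_pmf` and `map_error` are illustrative choices.

```python
from itertools import product

# Conditional pmfs of the ternary observation X_k, copied from the example.
P_H0 = [0.95, 0.05, 0.00]
P_H1 = [0.05, 0.90, 0.05]
PRIOR = (0.5, 0.5)  # the two hypotheses are a priori equally likely

# The two monotone binary quantizers: index = value of X_k, entry = U_k.
RULES = {"A": (0, 1, 1), "B": (0, 0, 1)}


def output_pmf(rule, pmf):
    """Return [P(U=0), P(U=1)] induced by a quantizer on a given pmf."""
    out = [0.0, 0.0]
    for x, p in enumerate(pmf):
        out[rule[x]] += p
    return out


def map_error(rules):
    """Error probability of the MAP decision based on the outputs of `rules`."""
    pmfs0 = [output_pmf(r, P_H0) for r in rules]
    pmfs1 = [output_pmf(r, P_H1) for r in rules]
    err = 0.0
    for u in product((0, 1), repeat=len(rules)):
        joint0, joint1 = PRIOR
        for k, uk in enumerate(u):
            joint0 *= pmfs0[k][uk]
            joint1 *= pmfs1[k][uk]
        err += min(joint0, joint1)  # MAP keeps the larger joint probability
    return err


for name1, name2 in product(RULES, repeat=2):
    both = map_error([RULES[name1], RULES[name2]])
    only1 = map_error([RULES[name1]])
    only2 = map_error([RULES[name2]])
    print(f"sensor 1: Rule {name1}, sensor 2: Rule {name2} -> both {both:.5f}, "
          f"sensor 1 only {only1:.5f}, sensor 2 only {only2:.5f}")
```

Under these assumptions the mixed assignment (Rule A at one sensor, Rule B at the other) gives the smallest fused error, while the Rule A/Rule A assignment shows no loss when one output is missing, in line with the discussion above.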
II. Problem Formulation

Fig. 2 depicts a two-sensor parallel fusion network tasked with a hypothesis testing problem. Each sensor
collects data that are generated according to one of the two hypotheses ($H_0$ and $H_1$) under test. We
assume in the present work that the local observations $X_1$ and $X_2$ are conditionally independent given
the underlying hypothesis, i.e., for $i = 0, 1$,

$$f(X_1, X_2 \mid H_i) = f(X_1 \mid H_i)\, f(X_2 \mid H_i).$$


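It is easy to establish that, with this conditional independence assumption, the LR pair of the local
sensor observations

$$L(X_1) = \frac{f(X_1 \mid H_1)}{f(X_1 \mid H_0)}, \qquad L(X_2) = \frac{f(X_2 \mid H_1)}{f(X_2 \mid H_0)}$$

forms a sufficient statistic for the detection problem.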

Based on its local observation $X_k$, the $k$th local sensor implements a binary quantizer whose output
$U_k \in \{0, 1\}$, for $k = 1, 2$, will be sent to the fusion center. The transmission, however, is
subject to channel outage or sensor failure. When both transmissions are successful, Decoder 0 will act
as a fusion center and make a final decision on which hypothesis is true using both $U_1$ and $U_2$.
Otherwise, if only one of the two transmissions is successful, either Decoder 1 or Decoder 2 will make a
final decision based on the successfully received $U_k$. In our current work, as $U_k$ is binary,
Decoders 1 and 2 will simply take $U_1$ and $U_2$ as their respective outputs, as illustrated in Fig. 2.

Adopting a Bayesian framework, we use error probability as the performance measure. Define $P_{ek}$ as
the probability of error at Decoder $k$:

$$P_{ek} = \pi_0 P(U_k = 1 \mid H_0) + \pi_1 P(U_k = 0 \mid H_1), \quad k = 0, 1, 2, \tag{1}$$

where $\pi_j = P(H_j)$ is the prior probability of hypothesis $H_j$, and $U_0$ denotes the decision output
of Decoder 0. Thus $P_{e0}$ corresponds to the error probability of the fusion center when both $U_1$ and
$U_2$ are available, while $P_{e1}$ and $P_{e2}$ are respectively the error probabilities at the
individual sensors. Classical distributed detection theory aims to minimize $P_{e0}$, while our present
work strives for a balance in performance between $P_{e0}$ and $P_{ek}$ for $k = 1, 2$.

Fig. 2. A two-sensor parallel fusion network with possible transmission failures.

Our approach is derived from the MD principle [1]: we aim to design sensor decision (quantization) rules
such that if one of the two transmissions is lost, an acceptable performance (in terms of error
probability) is guaranteed; if both transmissions are successful, a better performance can be achieved.
Catering to the hypothesis testing problem, we can succinctly summarize the design criterion using the
following constrained minimization problem:

$$\min\; P_{e0} \quad \text{subject to} \quad P_{e1} \le \epsilon_1 \ \text{ and } \ P_{e2} \le \epsilon_2, \tag{2}$$

where $\epsilon_1$ and $\epsilon_2$ are the pre-specified error probabilities that are guaranteed if only
$U_1$ or $U_2$ is successfully received. This design criterion is reminiscent of the MD scalar quantizer
design [3] where a general distortion measure is used.

III.  Necessary  Conditions  for  Optimality  and  a 
Design  Algorithm 

The constrained optimization problem readily admits a Lagrangian formulation, which is used below to
solve the minimization problem. The Lagrangian function is given by

$$L(\tau_1, \tau_2, \lambda_1, \lambda_2) = P_{e0} + \lambda_1 (P_{e1} - \epsilon_1) + \lambda_2 (P_{e2} - \epsilon_2),$$

where $\tau_k$ is the local sensor LRT threshold and $\lambda_k$ is the Lagrangian multiplier, for
$k = 1, 2$.

Using the Kuhn-Tucker theorem [10], the set of optimum solutions of the constrained minimization problem
must satisfy the following necessary conditions, for $k = 1, 2$,

$$
\begin{aligned}
\frac{\partial P_{e0}}{\partial \tau_k} + \lambda_1 \frac{\partial P_{e1}}{\partial \tau_k} + \lambda_2 \frac{\partial P_{e2}}{\partial \tau_k} &= 0 && (3) \\
\lambda_k &\ge 0 && (4) \\
P_{ek} - \epsilon_k &\le 0 && (5) \\
\lambda_k (P_{ek} - \epsilon_k) &= 0 && (6)
\end{aligned}
$$

Given the above necessary conditions, the optimum solutions for the local decision rules are described in
the following theorem.

Theorem 1: Assume that the two local observations, $X_k$'s, are conditionally independent. Further,
assume that the fusion rule and the $k$th local sensor decision rule satisfy, for $k = 1, 2$,

$$
\begin{cases}
P(U_0 = 1 \mid U_k = 1, U_{\bar{k}}) - P(U_0 = 1 \mid U_k = 0, U_{\bar{k}}) \ge 0 \\
P(U_0 = 0 \mid U_k = 0, U_{\bar{k}}) - P(U_0 = 0 \mid U_k = 1, U_{\bar{k}}) \ge 0
\end{cases}
$$

where $\bar{k} = 3 - k$, thus $\bar{1} = 2$ and $\bar{2} = 1$. Then the optimum solution of the
constrained minimization problem in Eq. (2) is given by the following LRT, for $k = 1, 2$,

$$
P(U_k = 1 \mid X_k) =
\begin{cases}
1, & \text{if } \dfrac{p(X_k \mid H_1)}{p(X_k \mid H_0)} \ge \tau_k \\
0, & \text{otherwise}
\end{cases}
\tag{7}
$$

where $\tau_k$, the optimal LRT threshold for the $k$th local sensor, is determined as follows:

•  When $\lambda_k = 0$ (inactive constraint),

$$\tau_k = \frac{\pi_0 A_k}{\pi_1 B_k}. \tag{8}$$

•  When $\lambda_k > 0$ (active constraint), $\tau_k$ is obtained by solving

$$P_{ek} - \epsilon_k = 0. \tag{9}$$

The associated $\lambda_k$ can be obtained by

$$\lambda_k = \frac{\pi_0 A_k - \pi_1 B_k \tau_k}{\pi_1 \tau_k - \pi_0}, \tag{10}$$

from which we get

$$\tau_k = \frac{\pi_0 (A_k + \lambda_k)}{\pi_1 (B_k + \lambda_k)}. \tag{11}$$

The quantities $A_k$ and $B_k$ in Eqs. (8)-(11) are defined respectively as

$$A_k = \sum_{U_{\bar{k}}} p(U_{\bar{k}} \mid H_0) \left[ P(U_0 = 1 \mid U_k = 1, U_{\bar{k}}) - P(U_0 = 1 \mid U_k = 0, U_{\bar{k}}) \right] \tag{12}$$

$$B_k = \sum_{U_{\bar{k}}} p(U_{\bar{k}} \mid H_1) \left[ P(U_0 = 0 \mid U_k = 0, U_{\bar{k}}) - P(U_0 = 0 \mid U_k = 1, U_{\bar{k}}) \right] \tag{13}$$

Theorem 1 is proved in Appendix I.

Remarks:

•  Note that the forms of $\tau_k$, $A_k$, and $B_k$ indicate that the threshold for the $k$th sensor is
a function of the decision rule at the other sensor. Thus, as expected, the optimal thresholds at sensors
1 and 2 are coupled with each other.

•  In order for the constrained optimization to have feasible solutions, $\epsilon_1$ and $\epsilon_2$
cannot be chosen to be too small. Specifically, $\epsilon_k$ needs to be no smaller than the minimum
achievable error probability at sensor $k$. More discussion about this can be found in Section V after we
introduce an alternative design approach (Approach 3 in Section V).

•  $P_{e0}$ is the achievable error probability using both $U_1$ and $U_2$. On the other hand, $P_{e1}$
and $P_{e2}$ are respectively the error probabilities at the local sensors, each associated with $U_1$ or
$U_2$. Thus $P_{e0} \le \min\{P_{e1}, P_{e2}\}$, where the inequality is due to the fact that one can
simply ignore one of $\{U_1, U_2\}$, so the error probability should be no worse than either $P_{e1}$ or
$P_{e2}$. Due to the constraints $P_{e1} \le \epsilon_1$ and $P_{e2} \le \epsilon_2$, we have

$$P_{e0} \le \min\{P_{e1}, P_{e2}\} \le \min\{\epsilon_1, \epsilon_2\}.$$

That is, the error probability achieved when both transmissions are successful is upper bounded by the
error probability constraints at the local sensors.

•  If $P_{e1} < \epsilon_1$ and $P_{e2} < \epsilon_2$, i.e., $\lambda_k = 0$ for $k = 1, 2$, $P_{e0}$ is
the minimum error probability that can be achieved at the fusion center. The constrained optimization
approach yields the same result as the unconstrained approach that minimizes the error probability at the
fusion center. This happens when the constraints $\epsilon_k$ are large enough.

•  Eq. (11) is a unifying expression of the optimal local LR threshold for the two cases of
$\lambda_k > 0$ and $\lambda_k = 0$.

•  The conditions described in Theorem 1 do not admit closed-form solutions. Simultaneously optimizing
$\tau_1$ and $\tau_2$ is intractable due to the distributed nature of the problem; it typically involves
an exhaustive search over a two-dimensional space for the $(\tau_1, \tau_2)$ pair. However, the necessary
conditions established in Theorem 1 allow us to adopt a person-by-person optimization (PBPO) approach
where each threshold is optimized assuming a fixed threshold at the other sensor. The PBPO approach has
been widely used in optimizing decentralized systems, and in particular, in classical distributed
detection (see, e.g., [11], [12]) where joint optimization is typically intractable.

•  Theorem 1 describes necessary conditions for the optimum LRT thresholds; thus multiple
initializations are needed to find the globally optimum thresholds.

Fig. 3. A discrete memoryless erasure channel model for the channel between sensor $k$ and the fusion
center.

The following iterative algorithm describes this PBPO procedure.

Iterative Algorithm

•  Step 1. Initialize $\tau_k$, for $k = 1, 2$.

•  Step 2. Obtain the optimum fusion rule for fixed $\tau_1$ and $\tau_2$.

•  Step 3. For fixed fusion rule and $\tau_2$, calculate $\tau_1$ using (8).

•  Step 4. Check to see if $\tau_1$ satisfies $P_{e1} - \epsilon_1 \le 0$.

   -  If yes, go to Step 5.

   -  If no, calculate $\tau_1$ using (9).

•  Step 5. For fixed fusion rule and $\tau_1$, calculate $\tau_2$ in a similar fashion.

•  Step 6. Check convergence, i.e., if the obtained $\tau_1$ and $\tau_2$ are identical (up to a
prescribed tolerance) to those from the previous iteration.

   -  If yes, stop.

   -  Otherwise, go to Step 2.

At each iteration, $\tau_k$ is optimized for a given fusion rule and the other threshold
$\tau_{\bar{k}}$, hence the error probability decreases monotonically until a stationary point is reached.
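The iterative algorithm above can be sketched in a few lines for a concrete observation model. The sketch below is illustrative only and rests on several assumptions that are not part of the algorithm statement: the unit-variance Gaussian shift-in-mean model of Section V ($s = 1$), so that an LRT threshold $\tau_k$ corresponds to the observation threshold $\eta_k = 1/2 + \ln \tau_k$; a MAP fusion rule in Step 2; and, when the constraint is active, a bisection that picks the root of Eq. (9) lying between the local-error minimizer and the unconstrained threshold. All function names are hypothetical.

```python
import math
from itertools import product


def q(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def rates(eta):
    """(P_F, P_D) of the local test 'U_k = 1 iff X_k > eta' (s = 1, unit variance)."""
    return q(eta), q(eta - 1.0)


def local_error(eta, pi0):
    pf, pd = rates(eta)
    return pi0 * pf + (1.0 - pi0) * (1.0 - pd)


def map_fusion(etas, pi0):
    """Step 2: MAP fusion rule {(u1, u2): u0} and its error probability P_e0."""
    r = [rates(e) for e in etas]
    rule, pe0 = {}, 0.0
    for u in product((0, 1), repeat=2):
        j0, j1 = pi0, 1.0 - pi0
        for (pf, pd), uk in zip(r, u):
            j0 *= pf if uk else 1.0 - pf
            j1 *= pd if uk else 1.0 - pd
        rule[u] = 1 if j1 >= j0 else 0
        pe0 += min(j0, j1)
    return rule, pe0


def a_b(k, rule, eta_other):
    """A_k and B_k of Eqs. (12)-(13) for a deterministic fusion rule."""
    pf, pd = rates(eta_other)
    a = b = 0.0
    for v in (0, 1):  # v is the other sensor's output
        hi = (1, v) if k == 0 else (v, 1)
        lo = (0, v) if k == 0 else (v, 0)
        diff = rule[hi] - rule[lo]
        a += (pf if v else 1.0 - pf) * diff
        b += (pd if v else 1.0 - pd) * diff
    return a, b


def update(k, rule, etas, eps, pi0):
    """Steps 3-5 for sensor k (0-based index): returns (eta_k, lambda_k)."""
    pi1 = 1.0 - pi0
    a, b = a_b(k, rule, etas[1 - k])
    if a <= 0.0 or b <= 0.0:  # fusion rule ignores sensor k: keep its threshold
        return etas[k], 0.0
    eta = 0.5 + math.log(pi0 * a / (pi1 * b))  # Eq. (8) as an observation threshold
    if local_error(eta, pi0) <= eps:
        return eta, 0.0
    lo = 0.5 + math.log(pi0 / pi1)  # minimizer of the local error probability
    if local_error(lo, pi0) >= eps:
        raise ValueError("eps must exceed the minimum local error probability")
    hi = eta
    for _ in range(100):  # Eq. (9): bisection for P_ek = eps
        mid = 0.5 * (lo + hi)
        if local_error(mid, pi0) <= eps:
            lo = mid
        else:
            hi = mid
    tau = math.exp(lo - 0.5)
    lam = (pi0 * a - pi1 * b * tau) / (pi1 * tau - pi0)  # Eq. (10)
    return lo, lam


def pbpo(eps, pi0, eta_init, tol=1e-9, max_iter=200):
    """Iterate Steps 2-6 from a common initial threshold."""
    etas, lams = [eta_init, eta_init], [0.0, 0.0]
    for _ in range(max_iter):
        prev = list(etas)
        rule = map_fusion(etas, pi0)[0]
        for k in (0, 1):
            etas[k], lams[k] = update(k, rule, etas, eps, pi0)
        if max(abs(x - y) for x, y in zip(etas, prev)) < tol:
            break
    return etas, lams, map_fusion(etas, pi0)[1]


etas, lams, pe0 = pbpo(eps=0.30, pi0=0.6, eta_init=1.2)
print(etas, lams, pe0)
```

Because Theorem 1 gives only necessary conditions, the result depends on `eta_init`; in practice one would rerun `pbpo` from several starting points and keep the feasible solution with the smallest $P_{e0}$.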

IV. Optimal Local Decision Rule Design Under an
Erasure Channel Model

The constrained minimization approach provides a proactive design methodology that avoids severe
performance degradation in the absence of CSI. We propose in this section an alternative approach by
imposing a certain parametric model on the channel/sensor failures. This allows us to adopt the existing
channel-aware approach [8] to design the local quantizers. Similar to [13], we model the potential
transmission loss using erasure channels where the erasure accounts for possible sensor failures/channel
outages. This channel model is illustrated in Fig. 3, where $\delta_k$ is the erasure probability
corresponding to sensor $k$, i.e., the probability that the channel output is the erasure symbol $E$ given
the channel input $U_k$. Our alternative optimization criterion is to minimize the average error
probability $P_e$, defined as

$$P_e = (1 - \delta_1)(1 - \delta_2) P_{e0} + \delta_2 (1 - \delta_1) P_{e1} + \delta_1 (1 - \delta_2) P_{e2} + \min\{\pi_0, \pi_1\}\, \delta_1 \delta_2, \tag{14}$$

where the last term corresponds to the error probability when both transmissions are lost. This constant
term has no effect on the quantizer design, hence can be dropped in the design problem.
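To make Eq. (14) concrete, the sketch below evaluates the average error probability for a pair of threshold quantizers and minimizes it by a coarse grid search. It is an illustration under stated assumptions rather than the design procedure of this section: it uses the unit-variance Gaussian model of Section V ($s = 1$), MAP fusion at Decoder 0, and restricts the search to identical thresholds and identical erasure probabilities at the two sensors. The function names are hypothetical.

```python
import math


def q(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def error_triple(eta1, eta2, pi0):
    """(P_e0, P_e1, P_e2) for the tests 'U_k = 1 iff X_k > eta_k' with s = 1, unit variance."""
    pi1 = 1.0 - pi0
    pf = (q(eta1), q(eta2))
    pd = (q(eta1 - 1.0), q(eta2 - 1.0))
    pe0 = 0.0
    for u1 in (0, 1):
        for u2 in (0, 1):
            j0, j1 = pi0, pi1
            for k, uk in enumerate((u1, u2)):
                j0 *= pf[k] if uk else 1.0 - pf[k]
                j1 *= pd[k] if uk else 1.0 - pd[k]
            pe0 += min(j0, j1)  # MAP fusion at Decoder 0
    pe = [pi0 * pf[k] + pi1 * (1.0 - pd[k]) for k in (0, 1)]
    return pe0, pe[0], pe[1]


def average_error(eta1, eta2, d1, d2, pi0):
    """Average error probability of Eq. (14)."""
    pe0, pe1, pe2 = error_triple(eta1, eta2, pi0)
    return ((1 - d1) * (1 - d2) * pe0 + d2 * (1 - d1) * pe1
            + d1 * (1 - d2) * pe2 + min(pi0, 1 - pi0) * d1 * d2)


def best_symmetric_threshold(delta, pi0, lo=-1.0, hi=3.0, step=1e-3):
    """Grid search over a common threshold eta_1 = eta_2 (an assumed restriction)."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return min(grid, key=lambda e: average_error(e, e, delta, delta, pi0))


for delta in (0.0174, 0.5, 0.9):
    eta = best_symmetric_threshold(delta, pi0=0.6)
    print(delta, round(eta, 3), round(average_error(eta, eta, delta, delta, 0.6), 4))
```

The last term of Eq. (14) does not depend on the thresholds, so removing it from `average_error` would leave the minimizing threshold unchanged.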

The following theorem provides the solution for the sensor decision rules that minimize $P_e$.

Theorem 2: Assume that the two local observations, $X_k$'s, are conditionally independent and the
channels are independent discrete memoryless erasure channels. Further, assume that the fusion rule and
the $k$th local sensor decision rule satisfy, for $k = 1, 2$,

$$
\begin{cases}
P(U_0 = 1 \mid U_k = 1, U_{\bar{k}}) - P(U_0 = 1 \mid U_k = 0, U_{\bar{k}}) \ge 0 \\
P(U_0 = 0 \mid U_k = 0, U_{\bar{k}}) - P(U_0 = 0 \mid U_k = 1, U_{\bar{k}}) \ge 0
\end{cases}
$$

where $\bar{k}$ is defined as in Theorem 1. Then the optimum local rule for the $k$th sensor amounts to
the following LRT, for $k = 1, 2$,

$$
P(U_k = 1 \mid X_k) =
\begin{cases}
1, & \text{if } \dfrac{p(X_k \mid H_1)}{p(X_k \mid H_0)} \ge \dfrac{\pi_0 (A_k + \alpha_k)}{\pi_1 (B_k + \alpha_k)} \\
0, & \text{otherwise}
\end{cases}
\tag{15}
$$

where

$$\alpha_1 = \frac{\delta_2}{1 - \delta_2} \quad \text{and} \quad \alpha_2 = \frac{\delta_1}{1 - \delta_1}, \tag{16}$$

and $A_k$ and $B_k$ are defined in Eqs. (12) and (13).

A proof is given in Appendix II. Following the same spirit of the iterative algorithm in Section III, we
can devise a similar procedure to find the optimal thresholds using Theorem 2.

Comparing Eqs. (11) and (15), we have some interesting observations that suggest intrinsic connections
between the erasure channel model and the constrained minimization formulation. From Eq. (14), if we drop
the last term and divide the average probability by $(1 - \delta_1)(1 - \delta_2)$, the new function to be
minimized becomes

$$Q = P_{e0} + \alpha_1 P_{e1} + \alpha_2 P_{e2},$$

with $\alpha_1$ and $\alpha_2$ defined as in Eq. (16). The design problem reduces to a problem of
minimizing $Q$ subject to $\alpha_1 \ge 0$ and $\alpha_2 \ge 0$. Comparing this with Eq. (3), we see that
$\alpha_k$ plays a role similar to that of the Lagrangian multiplier $\lambda_k$.
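The relation between the erasure probabilities, the weights $\alpha_k$ and the objective $Q$ is simple enough to check numerically. The sketch below is an illustration, not part of the original derivation: it implements Eq. (16) and the expression for $Q$, then evaluates $Q$ at the error probabilities that the paper reports for its local minima in Tables IV and V (Section V). The function names are hypothetical.

```python
def alpha(delta_other):
    """Eq. (16): alpha_k computed from the OTHER sensor's erasure probability."""
    return delta_other / (1.0 - delta_other)


def q_objective(pe0, pe1, pe2, a1, a2):
    """Q = P_e0 + alpha_1 * P_e1 + alpha_2 * P_e2."""
    return pe0 + a1 * pe1 + a2 * pe2


def average_from_q(q_value, d1, d2, pi0):
    """Recover P_e of Eq. (14) from Q by undoing the scaling and restoring the constant."""
    return (1.0 - d1) * (1.0 - d2) * q_value + min(pi0, 1.0 - pi0) * d1 * d2


# Reported local minima as (P_ek, P_e0), with P_e1 = P_e2 = P_ek.
table_iv = {"alpha": 0.1652, "minima": [(0.3023, 0.2636), (0.32, 0.2591)]}
table_v = {"alpha": 0.4023, "minima": [(0.3101, 0.2645), (0.3, 0.2649)]}

for name, tab in (("Table IV", table_iv), ("Table V", table_v)):
    a = tab["alpha"]
    values = [q_objective(pe0, pek, pek, a, a) for pek, pe0 in tab["minima"]]
    print(name, [round(v, 4) for v in values])
```

In both tables the second entry is the point where the local error probability equals the constraint $\epsilon_k$; whether it also has the smaller $Q$ is what decides if the two formulations share a solution.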

Furthermore, the first-order necessary conditions for minimizing $Q$ are given by:

$$
\begin{aligned}
\frac{\partial P_{e0}}{\partial \tau_k} + \alpha_1 \frac{\partial P_{e1}}{\partial \tau_k} + \alpha_2 \frac{\partial P_{e2}}{\partial \tau_k} &= 0 && (17) \\
\alpha_k &\ge 0 && (18)
\end{aligned}
$$

Comparing Eqs. (17) and (18) to Eqs. (3)-(6), we notice that these two formulations are similar except
that the constrained optimization approach has more restrictive constraints (Eqs. (5) and (6)). Next we
elaborate on when these two formulations will have identical optimal solutions.

Consider the first case: when $\lambda_k = 0$, i.e., the constraints $P_{ek} < \epsilon_k$ are satisfied.
In this case, set $\alpha_k = \lambda_k = 0$, and the two formulations have the same optimal thresholds.
The case of $\lambda_k > 0$ is more complicated. With $\lambda_k > 0$, we have $P_{ek} = \epsilon_k$,
$k = 1, 2$. Assume the erasure channel model yields $L$ local minima, with the corresponding threshold
pairs $(\tau_1^l, \tau_2^l)$, $l = 1, 2, \ldots, L$. Denote by $P_{ej}^l$, $j = 0, 1, 2$, the error
probabilities associated with $(\tau_1^l, \tau_2^l)$. By virtue of the problem formulation, there must
exist one $(\tau_1^m, \tau_2^m)$ whose local error probabilities satisfy $P_{ek}^m = \epsilon_k$,
$k = 1, 2$. If

$$Q^m \triangleq P_{e0}^m + \alpha_1 P_{e1}^m + \alpha_2 P_{e2}^m \;\le\; P_{e0}^j + \alpha_1 P_{e1}^j + \alpha_2 P_{e2}^j \triangleq Q^j \tag{19}$$

for $j \ne m$, $j = 1, 2, \ldots, L$, then $(\tau_1^m, \tau_2^m)$ is the optimal solution for both the
constrained minimization formulation and the erasure channel formulation. We will further illustrate
these connections using some numerical examples in the next section.


V.  A  Numerical  Example 

In this section, we use several numerical examples to highlight the robust performance of the proposed
local quantizer design compared with the classical distributed detection approach. Consider the detection
of a known signal in independent Gaussian noise using two sensors:

$$H_0 : X_k = n_k, \qquad H_1 : X_k = s + n_k,$$

where $s$ is a known signal and $n_k$ is zero-mean Gaussian noise with variance $\sigma^2$, for
$k = 1, 2$. Without loss of generality, we assume $s = 1$ and $\sigma^2 = 1$. Each local sensor makes a
binary decision using its observation $X_k$ and a decision rule $\gamma_k$, i.e.,
$U_k = \gamma_k(X_k) \in \{0, 1\}$. The transmission of $U_k$, however, is subject to channel losses. If
both $U_1$ and $U_2$ are successfully received, Decoder 0 will implement the maximum a posteriori
probability decoding (detection) rule, i.e.,

$$
P(U_0 = 1 \mid U_1, U_2) =
\begin{cases}
1, & \text{if } \dfrac{P(U_1, U_2 \mid H_1)}{P(U_1, U_2 \mid H_0)} \ge \dfrac{\pi_0}{\pi_1} \\
0, & \text{otherwise}
\end{cases}
\tag{20}
$$

For simplicity, we consider a symmetric setting where we use identical error probability constraints
(i.e., $\epsilon_1 = \epsilon_2$) for the constrained minimization approach, and identical erasure
probabilities (i.e., $\delta_1 = \delta_2$) for the erasure channel model approach.

In addition to the proposed approaches, we also present results using alternative approaches to highlight
the robustness of the proposed MD principle based framework. The complete list of approaches used in the
simulations is as follows.

Approach 1  Constrained minimization described in Section III (Theorem 1).

Approach 2  Erasure channel model approach described in Section IV (Theorem 2).

Approach 3  Minimizing the local error probabilities $P_{e1}$ and $P_{e2}$. We denote by $P_{ek,3}$
($k = 1, 2$) the minimum achievable local error probabilities, and by $P_{e0,3}$ the corresponding error
probability at Decoder 0. Note that $P_{ek,3}$ provides the lower bound for the local error probability
constraint $\epsilon_k$, i.e., one must have $\epsilon_k \ge P_{ek,3}$ for the constrained minimization
formulation to have a solution.

Approach 4  Minimizing the error probability at the fusion center. We denote by $P_{e0,4}$ the minimum
achievable error probability at Decoder 0, and by $P_{ek,4}$ ($k = 1, 2$) the corresponding local error
probabilities. This approach corresponds to classical distributed detection with a single objective
function. An interesting observation is that this approach can be considered as a special case of the
erasure channel model with $\delta_k = 0$, for $k = 1, 2$. As such, one only needs to minimize $P_{e0}$
as both transmissions are always assumed successful.

Notice that Approaches 3 and 4 conflict with each other: one can show that $P_{e0}$ and $P_{ek}$ for
$k = 1, 2$ cannot be optimized simultaneously [14]. Otherwise, the entire distributed MD framework would
become trivial, as one could simultaneously optimize the local error probability and that of Decoder 0
(the fusion center).
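The conflict between Approaches 3 and 4 can be seen by sweeping a common threshold over a grid and recording the two minimizers. The sketch below is an illustration under assumptions that go beyond the text: both sensors are forced to use the same observation threshold, the model is the unit-variance Gaussian one defined above with $s = 1$, and Decoder 0 applies the MAP rule of Eq. (20). The names `errors` and `sweep` are hypothetical.

```python
import math


def q(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def errors(eta, pi0):
    """(P_e0, P_ek) when both sensors use 'U_k = 1 iff X_k > eta' and Decoder 0 is MAP."""
    pi1 = 1.0 - pi0
    pf, pd = q(eta), q(eta - 1.0)
    pe0 = 0.0
    for ones in (0, 1, 2):  # number of sensors reporting 1
        mult = 2 if ones == 1 else 1  # (0, 1) and (1, 0) have equal probability
        j0 = pi0 * pf ** ones * (1.0 - pf) ** (2 - ones)
        j1 = pi1 * pd ** ones * (1.0 - pd) ** (2 - ones)
        pe0 += mult * min(j0, j1)
    return pe0, pi0 * pf + pi1 * (1.0 - pd)


def sweep(pi0, lo=-1.0, hi=3.0, step=1e-3):
    """Return (eta for Approach 3, eta for Approach 4) on a common-threshold grid."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    eta3 = min(grid, key=lambda e: errors(e, pi0)[1])  # best local error
    eta4 = min(grid, key=lambda e: errors(e, pi0)[0])  # best fusion-center error
    return eta3, eta4


eta3, eta4 = sweep(0.6)
print("Approach 3:", round(eta3, 3), errors(eta3, 0.6))
print("Approach 4:", round(eta4, 3), errors(eta4, 0.6))
```

The two minimizers differ, and each one is clearly suboptimal for the other criterion, which is the tradeoff the constrained formulation is designed to manage.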

As we are considering a Gaussian problem, the obtained LRT thresholds at the sensors can be directly
translated into thresholds for the original observations. Thus in the following presentation, we will use
thresholds for the original observations, denoted by $\eta_k$ for $k = 1, 2$.
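For the shift-in-mean Gaussian model, the translation between an LRT threshold and an observation threshold has a closed form: the likelihood ratio is $\exp\{(sx - s^2/2)/\sigma^2\}$, so it exceeds $\tau$ exactly when $x$ exceeds $s/2 + (\sigma^2/s)\ln\tau$ (for $s > 0$). The short sketch below encodes this mapping; the function names are illustrative and not taken from the paper.

```python
import math


def observation_threshold(tau, s=1.0, sigma2=1.0):
    """Observation threshold equivalent to the LRT 'p(x|H1)/p(x|H0) >= tau'; requires s > 0."""
    return s / 2.0 + sigma2 * math.log(tau) / s


def lrt_threshold(eta, s=1.0, sigma2=1.0):
    """Inverse mapping: LRT threshold equivalent to the test 'x >= eta'."""
    return math.exp(s * (eta - s / 2.0) / sigma2)


# Example: the threshold minimizing a single sensor's error probability is tau = pi_0 / pi_1.
for pi0 in (0.6, 0.8):
    print(pi0, round(observation_threshold(pi0 / (1.0 - pi0)), 4))
```

With $s = 1$ and $\sigma^2 = 1$ this reduces to $\eta_k = 1/2 + \ln \tau_k$.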

The numerical results are summarized in Tables I-III as well as in Figs. 4-6. Specifically,

•  Tables I and II enumerate respectively the parameters and the obtained thresholds and error
probabilities of the two proposed approaches (Approaches 1 and 2).

•  Table III gives the obtained thresholds and error probabilities of the two alternative approaches
(Approaches 3 and 4).

•  Figs. 4 and 5 give the analytically calculated error probabilities (both of the fusion center and
local sensors) versus threshold plots with two different priors, $\pi_0 = 0.6$ and $\pi_0 = 0.8$,
respectively. In each plot, (b) is a zoom-in of (a) for better visualization.

•  Fig. 6 is the error probability versus erasure probability plot.

TABLE I
Thresholds and Error Probabilities Obtained Using Approach 1

              π0 = 0.6                                  π0 = 0.8
  εk      ηk       Pek      Pe0      λk        εk      ηk       Pek      Pe0      λk
  0.33    0.2629   0.33     0.2574   0.0177    0.25    0.8474   0.2466   0.1686   0
  0.32    0.3609   0.32     0.2591   0.1652    0.21    1.1737   0.21     0.1744   0.1788
  0.31    1.2895   0.3047   0.2632   0.0001    0.2     2.0292   0.1866   0.1775   0.001
  0.30    1.1812   0.30     0.2649   0.4023    0.19    2.0292   0.1866   0.1775   0.001

TABLE II
Thresholds and Error Probabilities Obtained Using Approach 2

              π0 = 0.6                                  π0 = 0.8
  δk      ηk       Pek      Pe0      αk        δk      ηk       Pek      Pe0      αk
  0.0174  0.2629   0.33     0.2574   0.0177    0.01    0.8667   0.2438   0.1686   0.0101
  0.2869  1.1812   0.3      0.2649   0.4023    0.4     1.9757   0.1864   0.1777   0.8
  0.5     1.0971   0.2973   0.2685   1.0       0.5     1.9616   0.1863   0.1778   1
  0.7     1.018    0.2955   0.2738   2.3333    0.7     1.9324   0.1863   0.1780   2.3333
  0.9     0.9417   0.2946   0.2807   9.0       0.9     1.902    0.1862   0.1784   9.0

TABLE III
Thresholds and Error Probabilities Obtained Using Approaches 3 and 4

  Approach 3    ηk       Pek,3    Pe0,3
  π0 = 0.6      0.9059   0.2945   0.2846
  π0 = 0.8      1.8863   0.1862   0.1787

  Approach 4    ηk       Pek,4    Pe0,4
  π0 = 0.6      0.2495   0.3315   0.2574
  π0 = 0.8      0.8474   0.2466   0.1686

Our  observations  from  the  numerical  results  are  summarized 
below. 

•  For Approach 1, the iterative algorithm indeed yields thresholds that are solutions to the constrained
optimization problem. For example, with $\pi_0 = 0.6$ and error probability constraint
$\epsilon_k = 0.3$, the threshold obtained using Approach 1 is $\eta_k = 1.1812$ with corresponding error
probabilities $P_{e0} = 0.2649$ and $P_{e1} = P_{e2} = 0.30$ (the left half of the last row of Table I).
This is consistent with Fig. 4 (the corresponding values are marked on Fig. 4(b)). Similarly, with
$\pi_0 = 0.8$ and $\epsilon_k = 0.2$, the minimum achievable $P_{e0} = 0.1775$ with the corresponding
threshold $\eta_k = 2.0311$ and local error probability $P_{ek} = 0.1866$. These values are marked on
Fig. 5 and are consistent with those listed in Table I (right half of the last row).

•  From Table I, it can be seen, by comparing the columns corresponding to $P_{ek}$ and $P_{e0}$, that
smaller local sensor error probabilities typically result in a larger error probability at the fusion
center. In general, having a generous constraint on local sensor error probabilities (large
$\epsilon_k$) imposes less restriction on the admissible threshold pairs, which typically gives rise to a
smaller $P_{e0}$. In the extreme case, for example, when $\epsilon_k = 0.5$, the obtained thresholds will
always coincide with those of Approach 4.

TABLE IV
Local Minima Obtained by the Erasure Channel Model Approach, εk = 0.32, αk = λk = 0.1652

  δk      ηk       Pek      Pe0      Q
  0.1418  1.237    0.3023   0.2636   0.3635
  0.1418  0.3609   0.32     0.2591   0.3648

•  The classical distributed detection (Approach 4) that minimizes error probability at the fusion center
suffers significant performance loss in the event of a lost transmission. This can be illustrated using
Fig. 4(b) along with Table III. At $\pi_0 = 0.6$, Approach 4 yields a globally minimum error probability
$P_{e0} = 0.2574$ at the fusion center. However, if one of the transmissions is lost, the error
probability suffers a significant degradation to $P_{ek} = 0.3315$ (marked on the dash-dotted curve).
Clearly, the constrained optimization approach is much more robust (a degradation from $P_{e0} = 0.2649$
to $P_{ek} = 0.30$). This effect is even more pronounced for the case of $\pi_0 = 0.8$. Approach 4 yields
a fusion center error probability $P_{e0} = 0.1686$ (corresponding to the minimum point of the solid curve
in Fig. 5(b)). However, if only one transmission reaches the fusion center, the error probability becomes
$P_{ek} = 0.2466$, which essentially renders this system useless: as the prior probability is
$\pi_0 = 0.8$, always deciding $H_0$ already caps the error probability at 0.2. This seemingly
pathological result is due to the fact that the threshold design at local sensors for the classical
distributed detection always assumes successful transmissions from other collaborating sensors.

•  Approach 3, which optimizes local sensor performance, does not have significant improvement when both
transmissions are successful. From Table III and Fig. 4, for $\pi_0 = 0.6$, the minimum local sensor error
probability is $P_{e1} = P_{e2} = 0.2945$. When both transmissions are successful, the fusion center will
have an error probability $P_{e0} = 0.2846$, which is only marginally better than the individual sensor's
performance. This improvement is much smaller than that achieved by the proposed constrained minimization
approach.

Fig. 4. Analytically calculated error probability versus threshold plot for $\pi_0 = 0.6$; (b) is a
zoom-in of (a). Horizontal axis: $\eta = \eta_1 = \eta_2$ (local observation thresholds).

Fig. 5. Analytically calculated error probability versus threshold plot for $\pi_0 = 0.8$; (b) is a
zoom-in of (a). Horizontal axis: $\eta = \eta_1 = \eta_2$ (local observation thresholds).

TABLE V
Local Minima Obtained by the Erasure Channel Model Approach, εk = 0.3, αk = λk = 0.4023

  δk      ηk       Pek      Pe0      Q
  0.2869  0.4784   0.3101   0.2645   0.514
  0.2869  1.1812   0.3      0.2649   0.5063

•  For the erasure channel model approach, as the erasure probability $\delta_k$ approaches one, the
obtained optimal local thresholds converge to those obtained using Approach 3 (minimizing the local error
probabilities). This can be seen by comparing Tables II and III: the thresholds obtained using Approach 2
approach those of Approach 3 as $\delta_k$ increases. This is expected since a large $\delta_k$ implies
that the channel is likely to break down, thus the local error probability will dominate the system
performance.

On the other hand, as the erasure probabilities approach zero, the obtained optimal local thresholds
converge to those that minimize the error probability at Decoder 0 (corresponding to Approach 4).
Intuitively, a small $\delta_k$ indicates a high probability of successful transmissions of both $U_1$
and $U_2$. Thus, the error probability at Decoder 0 would largely determine the system performance. The
same behavior can be observed from Fig. 6, plotted for $\pi_0 = 0.6$, by looking at the two extreme points
corresponding to $\delta_k = 0$ and $\delta_k = 1$. The associated error probabilities coincide with
those of Approaches 4 and 3, respectively.

[Figure: horizontal axis $\delta = \delta_1 = \delta_2$ (erasure probabilities).]
Fig. 6. Error probability versus erasure probability plot for $\pi_0 = 0.6$
obtained using the channel-aware quantization for the erasure channel model.

•  We have explored the intrinsic connections between Approaches 1 and
2 in Section IV. We now present numerical results to further illustrate
these connections. Consider the case of $\pi_0 = 0.6$.

-  With $\epsilon_k = 0.32$, the corresponding $\lambda_k = 0.1652$.
Setting $a_k = 0.1652$, we obtain two local minima that satisfy
Eq. (17), as listed in Table IV. Since we want to choose the thresholds
that minimize $Q$, it turns out that $\eta_1 = \eta_2 = 1.237$ (with
$P_{e1} = P_{e2} = 0.3023$ and $P_{e0} = 0.2636$) is the optimal
solution for the erasure channel model approach. The constrained
minimization approach, however, results in the thresholds
$\eta_1 = \eta_2 = 0.3609$ (with $P_{e1} = P_{e2} = 0.32$ and
$P_{e0} = 0.2519$). From Table IV, it is clear that Eq. (19) does not
hold, i.e., the value of $Q$ corresponding to $P_{ek} = 0.32$ is not the
smaller of the two. Hence, in this particular setup, the two approaches
do not have the same optimal solution.

-  Now we examine a case in which the two formulations share the same
solution. Consider $\epsilon_k = 0.3$, for which the corresponding
$\lambda_k = 0.4023$. Setting $a_k = 0.4023$, there are again two local
minima, listed in Table V, obtained using the erasure channel model
approach. We notice that $\eta_1 = \eta_2 = 1.1812$ is the optimal
solution for both approaches, and it is easy to check that Eq. (19) is
satisfied.

•  In general, the rate of convergence of the proposed iterative
algorithm depends on the initial values of the local thresholds. Our
simulations indicate that the algorithm converges quickly: for all the
scenarios we have examined, convergence typically occurs within a few
(< 10) iterations. For instance, the results in Table I were obtained
after about six iterations on average.
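
The Table V entries discussed above can be checked with a few lines of
arithmetic. The sketch below assumes that the cost minimized by the
erasure channel model approach has the weighted form
$Q = P_{e0} + a_1 P_{e1} + a_2 P_{e2}$ and that the weight is related to
the erasure probability by $a_k = \delta_k / (1 - \delta_k)$. Both
relations are assumptions inferred from the tabulated numbers, not
formulas restated in this section, and the function names are
hypothetical.

```python
def erasure_weight(delta):
    """Assumed relation a = delta / (1 - delta) between weight and erasure probability."""
    return delta / (1.0 - delta)


def q_cost(pe0, pe1, pe2, a1, a2):
    """Assumed weighted cost Q = P_e0 + a_1 * P_e1 + a_2 * P_e2."""
    return pe0 + a1 * pe1 + a2 * pe2


A_K = 0.4023  # a_k = lambda_k quoted for Table V
# (eta_1 = eta_2, P_ek, P_e0) for the two local minima quoted in Table V
TABLE_V = [(0.4784, 0.3101, 0.2645), (1.1812, 0.3, 0.2649)]

# Both sensors use the same threshold, so P_e1 = P_e2 = P_ek.
costs = [q_cost(pe0, pek, pek, A_K, A_K) for _, pek, pe0 in TABLE_V]
best_eta = TABLE_V[costs.index(min(costs))][0]
print([round(c, 4) for c in costs], best_eta)
```

Under these assumptions the two costs round to the tabulated values
0.514 and 0.5063, the smaller one belongs to the threshold 1.1812, and
the value 0.2869 listed with the table maps to a weight of about 0.4023.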


VI.  Conclusions 

In this paper, we developed robust signal processing techniques for
distributed sensor network applications. In particular, we presented a
distributed multiple description quantization (DMDQ) framework for the
design of sensor signaling in the presence of sensor failures/channel
outages. Two approaches were proposed to address the DMDQ design using
a two-sensor distributed detection problem. The first is based on
constrained minimization, and a solution using Lagrange multipliers was
presented. The second imposes a discrete erasure channel model, for
which we developed a channel-aware quantizer design that minimizes the
average error probability. Iterative algorithms were constructed to
search for the optimal thresholds. The intrinsic connections between
the two approaches were explored. A design example was used to show how
DMDQ can be implemented in a real distributed detection problem, and to
demonstrate its robust performance compared with the classical
distributed detection approach in the presence of possible transmission
losses.

Our future work will address the application of the MD principle to
sensor networks involving more than two sensors. The problem
conceivably becomes much more complex, as the number of objective
functions grows exponentially with the number of sensors. Thus the
constrained minimization approach may not be feasible. On the other
hand, the erasure channel model essentially collapses the multiple
objective functions into a single error probability, making it more
appealing for large sensor networks. A thorough understanding of the
connection between the constrained minimization problem and the erasure
channel model will provide valuable insight into how to choose the
erasure channel model parameters.


Appendix  I 
Proof  of  Theorem  1 

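Without loss of generality, we expand $P_{e0}$ with respect to $U_1$ and
rewrite $P_{e0}$ in the form of Eq. (21), where $A_1$ and $B_1$ are
defined in Eqs. (12) and (13), and

$$C_1 = \pi_0 \sum_{U_2} P(U_2|H_0)\, P(U_0 = 1 | U_1 = 0, U_2) + \pi_1 \sum_{U_2} P(U_2|H_1)\, P(U_0 = 0 | U_1 = 0, U_2).$$

$C_2$ can be similarly defined by swapping the roles of $U_1$ and $U_2$.
In general, we can rewrite $P_{e0}$ as

$$P_{e0} = \pi_0 P(U_k = 1|H_0) A_k - \pi_1 P(U_k = 1|H_1) B_k + C_k$$

for $k = 1, 2$.

From Eq. (1), the local error probabilities can be expressed as
$P_{ek} = \pi_0 P_{fk} + \pi_1 (1 - P_{dk})$, where
$P_{fk} = P(U_k = 1|H_0)$ and $P_{dk} = P(U_k = 1|H_1)$. Thus the
left-hand side of Eq. (3) becomes

$$\begin{aligned}
\frac{\partial P_{e0}}{\partial \tau_k} + \sum_{i=1}^{2} \lambda_i \frac{\partial (P_{ei} - \epsilon_i)}{\partial \tau_k}
&= \pi_0 A_k \frac{\partial P_{fk}}{\partial \tau_k} - \pi_1 B_k \frac{\partial P_{dk}}{\partial \tau_k} + \lambda_k \left( \pi_0 \frac{\partial P_{fk}}{\partial \tau_k} - \pi_1 \frac{\partial P_{dk}}{\partial \tau_k} \right) \\
&= \pi_0 A_k \frac{\partial P_{fk}}{\partial \tau_k} - \pi_1 B_k \tau_k \frac{\partial P_{fk}}{\partial \tau_k} + \lambda_k \left( \pi_0 \frac{\partial P_{fk}}{\partial \tau_k} - \pi_1 \tau_k \frac{\partial P_{fk}}{\partial \tau_k} \right)
\end{aligned} \tag{22}$$

where we have used the fact that $dP_{dk}/dP_{fk} = \tau_k$, and
$\tau_k$ is the LR threshold for the $k$th sensor.

Setting (22) equal to zero, we have
$\tau_k = \frac{\pi_0 (A_k + \lambda_k)}{\pi_1 (B_k + \lambda_k)}$.
[...] (11) follow by directly applying the Kuhn-Tucker theorem for the
two cases $\lambda_k = 0$ and $\lambda_k > 0$ separately. Thus, Theorem
1 is proved.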
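
The ROC-slope property used in the proof, $dP_d/dP_f = \tau$ for a
likelihood ratio test, can be checked numerically. The sketch below is
an illustration only: it assumes a Gaussian shift-in-mean observation
model ($H_0$: $N(0,1)$, $H_1$: $N(\mu,1)$), which is not taken from this
paper, and the function names are hypothetical.

```python
import math


def q_function(t):
    """Gaussian tail probability Q(t) = P(N(0,1) > t)."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def roc_point(tau, mu):
    """(P_f, P_d) of the likelihood ratio test L(x) > tau for N(0,1) vs N(mu,1)."""
    # L(x) = exp(mu * x - mu**2 / 2) > tau  is equivalent to  x > t
    t = math.log(tau) / mu + mu / 2.0
    return q_function(t), q_function(t - mu)


def roc_slope(tau, mu, rel_step=1e-4):
    """Finite-difference estimate of dP_d / dP_f at the operating point set by tau."""
    pf_lo, pd_lo = roc_point(tau * (1.0 + rel_step), mu)
    pf_hi, pd_hi = roc_point(tau * (1.0 - rel_step), mu)
    return (pd_hi - pd_lo) / (pf_hi - pf_lo)


for tau in (0.5, 1.0, 1.5):
    print(tau, roc_slope(tau, mu=1.0))
```

For each threshold the estimated slope should agree with $\tau$ up to
the finite-difference error, which is the substitution made in the
second line of (22).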


$$\begin{aligned}
P_{e0} &= \pi_0 P(U_0 = 1|H_0) + \pi_1 P(U_0 = 0|H_1) \\
&= \pi_0 \sum_{U_1} \sum_{U_2} P(U_0 = 1|U_1, U_2)\, P(U_1, U_2|H_0) + \pi_1 \sum_{U_1} \sum_{U_2} P(U_0 = 0|U_1, U_2)\, P(U_1, U_2|H_1) \\
&= \pi_0 \sum_{U_2} P(U_2|H_0) \left[ P(U_0 = 1|U_1 = 1, U_2)\, P(U_1 = 1|H_0) + P(U_0 = 1|U_1 = 0, U_2)\, P(U_1 = 0|H_0) \right] \\
&\quad + \pi_1 \sum_{U_2} P(U_2|H_1) \left[ P(U_0 = 0|U_1 = 1, U_2)\, P(U_1 = 1|H_1) + P(U_0 = 0|U_1 = 0, U_2)\, P(U_1 = 0|H_1) \right] \\
&= \pi_0 P(U_1 = 1|H_0) A_1 - \pi_1 P(U_1 = 1|H_1) B_1 + C_1
\end{aligned} \tag{21}$$

Footnote to Appendix I: The relation $dP_{dk}/dP_{fk} = \tau_k$ is a
property of the receiver operating characteristic (ROC) curve for a
likelihood ratio test. The threshold corresponding to a $(P_f, P_d)$
pair equals the slope of the ROC curve at that point.


Appendix II
Proof of Theorem 2

Similar to the proof in Appendix I, $P_{e0}$ can be expanded with
respect to the individual decision rules, and we get, for $k = 1, 2$,

$$P_{e0} = \int \left[ \pi_0 A_k P(X_k|H_0) - \pi_1 B_k P(X_k|H_1) \right] P(U_k = 1|X_k)\, dX_k + C_k \tag{23}$$

where $C_k$ has no effect on the decision rule at sensor $k$.

Similarly, the error probability at the $k$th sensor can be expanded as

$$\begin{aligned}
P_{ek} &= \pi_0 P(U_k = 1|H_0) + \pi_1 P(U_k = 0|H_1) \\
&= \int \left[ \pi_0 P(X_k|H_0) - \pi_1 P(X_k|H_1) \right] P(U_k = 1|X_k)\, dX_k + \pi_1.
\end{aligned} \tag{24}$$

Thus,  using  sensor  1  as  an  illustration,  the  average  error 
probability  Pe  can  be  written  as  the  form  in  Eq.  (25),  from  Eqs. 
(14),  (23),  and  (24).  As  Pi  is  independent  of  the  quantizer  rule 
at  sensor  1,  we  need  only  to  minimize  the  first  term  in  Eq.  (25) 
with  respect  to  the  local  decision  rule  for  sensor  1.  Thus,  the 
optimum  local  decision  rule  for  sensor  1  is  as  follows. 


if  Pi  >  0 
otherwise 


(26) 


This is equivalent to the decision rule specified by Eqs. (15)-(16) for
$k = 1$. The optimum quantizer rule for sensor 2 can be similarly
established. This completes the proof of Theorem 2.


Cooperative Relay Design

Abstract: In wireless networks, user cooperation has been proposed to
mitigate the effect of multipath fading channels. Recognizing the
connection between cooperative relay with finite alphabet sources and
the distributed detection problem, we design relay signaling via
channel-aware distributed detection theory. Focusing on a wireless
relay network composed of a single source-destination pair with L relay
nodes, we derive the necessary conditions for optimal relay signaling
that minimizes the error probability at the destination node. The
derived conditions are person-by-person optimal: each local relay rule
is optimized by assuming fixed relay rules at all other relay nodes and
a fixed decoding rule at the destination node. An iterative algorithm
is proposed for finding a set of relay signaling approaches that are
simultaneously person-by-person optimal. Numerical examples indicate
that the proposed scheme provides performance improvement over the two
existing cooperative relay strategies, namely amplify-forward and
decode-forward.

Index Terms: Cooperative relay, decentralized detection, finite
alphabet, wireless relay network.


I.  Introduction

In wireless networks, a severe limiting factor is multipath-induced
channel fading. One of the most effective methods of mitigating fading
is to exploit diversity. Examples include spatial diversity when
multiple antennas are used at the transceivers, multipath diversity in
frequency-selective channels, and temporal diversity in time-selective
fading channels through the use of coding/interleaving. More recently,
a new diversity resource has attracted considerable attention,
especially in the context of wireless ad hoc networks [1]-[3]. There,
multiple nodes collaborate in transmitting their information, thus
providing diversity by exploiting the independence of the fading
channels of different users. This is generally referred to as
cooperative diversity, and the collection of cooperating nodes,
including the source and the destination nodes, is referred to as a
relay network.

Historically, the study of relay networks has focused on capacity
issues, e.g., achievable rates. The classical three-node relay network
was first introduced by van der Meulen [4], and its capacity was
extensively studied by Cover and El Gamal [5]. Gastpar and Vetterli [6]
considered the capacity of wireless networks with multiple relay nodes
and showed that the lower and upper bounds coincide asymptotically as
the number of nodes in the network goes to infinity. Sendonaris et al.
[1], [2] were the first to introduce the concept of user cooperation
diversity, where the mobile users share their antennas and other
resources to obtain diversity gain through distributed transmission.
Focusing on a two-user case, they showed that user cooperation results
in an increase in capacity for both users. In addition, the achievable
rates are less susceptible to channel variations, making the
cooperative network a more robust system. Kramer et al. considered
several coding strategies for various relay networks in [7] and showed
that a strategy that mixes decode-forward and compress-forward achieves
capacity if the terminals form two closely spaced clusters.

The performance of wireless relay networks has also been evaluated by
diversity gain and outage probability. By constraining the nodes to
half-duplex mode, Laneman et al. [3] developed various cooperative
transmission protocols and showed that most of the protocols achieve
full diversity order (equal to the number of cooperative nodes).
Space-time code-based cooperative transmission protocols were developed
in [8] and were also shown to achieve full diversity. In [9] and [10],
symbol error probabilities were derived in the high signal-to-noise
ratio (SNR) regime for the general multihop, multibranch wireless relay
model using the amplify-forward (AF) scheme; the result provides
insight into the optimum placement of relay nodes. Chen and Laneman
[11] focused on the decode-forward (DF) scheme and developed a general
framework for maximum-likelihood (ML) demodulation in cooperative
wireless communication systems.

In this paper, we focus on a relay network consisting of a single
source-destination pair and L relay nodes. As illustrated in Fig. 1,
each relay node receives the signal from the source node and generates
a processed signal based on its received signal. The processed signals
from all the relay nodes are sent to the destination node using
orthogonal channels. The destination node uses the relay signals along
with the signal sent directly from the source node to determine the
source signal. Novel in the current work is the attempt to find
channel-aware processing that minimizes the error probability at the
destination node. The proposed design approach exploits the
finite-alphabet (FA) property of the source message, thereby enabling
us to pose the cooperative relay design as a distributed multiple
hypotheses testing problem. Notice that this FA property is ubiquitous
in almost all wireless systems. A similar idea has been explored in
[12] to study a diversity combining scheme using the quantized outputs
from multiple antennas with independently faded binary frequency-shift
keying (BFSK) signals. Distinctive in the current paper, in addition to
considering a general FA source instead of BFSK, is that the relay
outputs are assumed to also go through general non-ideal channels. Our
approach is to generalize the channel-aware distributed signaling
design for the binary hypothesis testing problem [13], [14] to this
cooperative relay problem and derive a numerical procedure to compute
the optimal local relay rules for minimum error probability at the
fusion center.

[Figure: source node, relay nodes, and destination node.]
Fig. 1. Wireless relay network with L relay nodes and a direct link
connecting the source and the destination nodes.

While DF also utilizes the FA property, the proposed approach is based
on optimum detection theory and thus provides superior error
probability performance. To motivate our proposed idea, we consider a
simple relay network with one source-destination pair and two relay
nodes. We also assume a parallel relay scheme where there is no direct
transmission between the source node and the destination node. The
source is binary with repetition coding, i.e., one transmits
“+1 +1 +1 +1” or “-1 -1 -1 -1”, where the redundancy is used to combat
channel impairment. We also restrict each relay node to send a four-bit
sequence to the destination node. If we adopt the DF idea, each relay
node attempts to recover the original binary source and resends it to
the destination node. However, for this simple example, it will be seen
that the optimum relay rule amounts to quantizing the local likelihood
ratio, and better performance may result if one uses the entire output
alphabet at the relay for the quantization. Contrasting this with the
DF approach, one can consider our approach as using “soft” information
from the relays as opposed to hard decisions for DF. As such, applying
distributed detection theory allows us to fully exploit the redundancy
in the FA sources for improved detection performance.
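
The contrast between a hard DF decision and a "soft" relay output can
be made concrete with a small sketch. It assumes a length-4 repetition
code observed in additive white Gaussian noise and a uniform 16-level
quantizer of the log-likelihood ratio; the noise model, the clipping
range `clip`, and all function names are illustrative assumptions. The
uniform quantizer is a stand-in and is not the optimized relay rule
derived in this paper.

```python
def llr_repetition(received, noise_var):
    """Log-likelihood ratio of all-(+1) versus all-(-1) in AWGN: 2 * sum(x) / sigma^2."""
    return 2.0 * sum(received) / noise_var


def df_relay(received, noise_var):
    """Decode-forward: one hard decision, re-encoded with the repetition code."""
    bit = 1 if llr_repetition(received, noise_var) >= 0 else -1
    return [bit] * len(received)


def llr_quantizer_relay(received, noise_var, n_bits=4, clip=8.0):
    """Soft relay: quantize the clipped LLR uniformly and send the index as n_bits +/-1 symbols."""
    llr = max(-clip, min(clip, llr_repetition(received, noise_var)))
    levels = 2 ** n_bits
    index = min(levels - 1, int((llr + clip) / (2.0 * clip) * levels))
    # Most significant bit first; bit value 1 is sent as +1, bit value 0 as -1.
    return [1 if (index >> b) & 1 else -1 for b in reversed(range(n_bits))]


confident = [0.9, 1.1, -0.2, 0.4]   # clearly favors +1
marginal = [0.3, -0.2, 0.1, -0.1]   # barely favors +1
for obs in (confident, marginal):
    print(df_relay(obs, 1.0), llr_quantizer_relay(obs, 1.0))
```

Both observations produce the same DF output, whereas the quantizer
maps them to different four-bit words, so the destination retains some
information about how reliable each relay observation was.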

Even without the redundancy in the FA sources, jointly designing the
relay and destination signaling can still result in improved
performance compared with DF. Consider, for example, a simple case in
which the source signals are either “+1” or “-1”. The relay nodes are
also restricted to transmit a binary (“+1” or “-1”) signal to the
destination node. Assume that the channels between the source and the
two relay nodes have identical channel SNRs, while the SNRs of the
channels between the two relays and the destination differ
significantly from each other. One natural question is: How do we
jointly determine the relay and destination processing/signaling that
minimizes the error probability at the destination node? Clearly, if
one resorts to the DF idea, each relay will try to recover the original
signal and retransmit it to the destination node. As such, one can
immediately conclude that this idea leads to identical relay rules at
the two relay nodes. On the other hand, as the channels between the
relays and the destination have different SNRs, should one design the
processing/signaling differently for better performance? As
demonstrated in Section IV, the optimum relaying for minimum error
probability indeed uses different signaling at the two relays. Our goal
is to come up with a mechanism for finding the optimal relay signaling.

The proposed cooperative relay signaling design assumes a clairvoyant
case, i.e., the designer knows the global channel state information
(CSI). While this is unrealistic, it provides an important performance
benchmark and reveals a significant gap in terms of error probability
performance between what is achievable with the existing schemes and
what is achievable theoretically. More importantly, the insight one
draws from this clairvoyant case study may prove critical in devising
cooperative signaling schemes under a more realistic setting with only
distributed CSI knowledge (i.e., each relay node knows only its own
CSI).

The rest of the paper is organized as follows: Section II describes the
system model and the problem formulation. The problem setting allows us
to derive, in Section III, the necessary conditions for optimal
cooperative relay strategies at relay nodes to minimize the error
probability at the destination node. In the same section, we also
consider several special models, including the three-node relay
network, the parallel relay model, and the singular relay network.
Numerical examples are presented in Section IV to show the substantial
performance gain of our approach over two existing relay strategies. We
conclude in Section V.

II.  Statement of the Problem

Consider a wireless relay network which includes one source node, L
relay nodes, and one destination node (Fig. 1). The data transmission
is divided into two steps. In the first step, the source node
broadcasts a signal $S$ to all the relay nodes as well as the
destination node. In the second step, the relay nodes transmit the
relay signals to the destination node in orthogonal channels. We assume
that $S$ is drawn from an FA set $\mathcal{S} = \{s_0, \ldots, s_{M-1}\}$
with prior probabilities $\{\pi_0, \ldots, \pi_{M-1}\}$. Further, the
received signals $X_1, \ldots, X_L$ at the relays and the received
signal $Z$ at the destination, which describe the broadcast channel
during the first step, are characterized by

$$p(X_1, \ldots, X_L, Z | S) = p(Z|S) \prod_{l=1}^{L} p(X_l|S) \tag{1}$$

i.e., the $X_l$ and $Z$ are conditionally independent given $S$. Here,
the transmitted signal $S$ can be a vector, and the received signals
$X_l$ and $Z$ would have a similar structure. The $l$th relay node sends
a relay signal $U_l$ to the destination node based on its received
signal $X_l$:

$$U_l = \gamma_l(X_l), \quad l = 1, \ldots, L. \tag{2}$$

We assume that, without loss of generality, $U_l$ belongs to an FA set
$\mathcal{T} = \{u_0, u_1, \ldots, u_{N-1}\}$. While it may appear
natural to require $N = M$, as in the case of DF, we can accommodate
$N \neq M$ in the proposed scheme. Indeed, as will be seen later,
allowing $N \neq M$ is advantageous as it provides flexibility in the
relay signaling design. We note here that the condition $N \neq M$ need
not mean that the source sequence and the relay message have different
lengths. Redundancy is typically built into the source sequence (e.g.,
channel coding), while the relay node may exploit all possible
alphabets, as illustrated in the example in Section I. The relay
outputs $U_1, \ldots, U_L$ are also sent through parallel transmission
channels characterized by

$$p(Y_1, \ldots, Y_L | U_1, \ldots, U_L) = \prod_{l=1}^{L} p(Y_l|U_l). \tag{3}$$

Note that all the signals, including $S$, $Z$, $X_l$, $Y_l$, and $U_l$,
are assumed to be vectors.

Upon collecting the channel outputs from the relay nodes,
$\mathbf{Y} = [Y_1, \ldots, Y_L]$, and from the source node, $Z$, the
destination node makes a final decision

$$U_0 = \gamma_0(\mathbf{Y}, Z) \tag{4}$$

where $U_0 \in \{s_0, \ldots, s_{M-1}\}$ indicates which signal was sent
from the source node.

An error happens if $U_0 \neq S$. The goal is, therefore, to jointly
design the local relay schemes $\gamma_l(\cdot)$, $l = 1, \ldots, L$,
and the decoding rule $\gamma_0(\cdot)$ such that the overall error
probability at the destination node, $P(U_0 \neq S)$, is minimized.
From the distributed detection point of view, this relay system can be
regarded as an M-ary hypotheses testing system with each hypothesis
corresponding to one of the input alphabet symbols, i.e.,
$H_i : S = s_i$. Given independence among the transmission channels,
the signals received at the relay nodes are independent conditioned on
the input source, or equivalently, a given hypothesis. Thus, the joint
probability density function (pdf) of the signals received at the
relays becomes

$$p(X_1, \ldots, X_L | H_i) = \prod_{l=1}^{L} p(X_l|H_i). \tag{5}$$

Similarly, for the signals received at the destination node, the joint
pdf conditioned on the decisions made at the relays is

$$p(Y_1, \ldots, Y_L, Z | U_1, \ldots, U_L, H_i) = p(Z|H_i) \prod_{l=1}^{L} p(Y_l|U_l), \quad i = 0, \ldots, M-1. \tag{6}$$

We point out here that integrating the transmission channels into the
decoding rule design has been investigated before in the context of
decision fusion in fading channels for wireless sensor networks (WSNs)
[15]-[17]. The optimal decoding rule in the Bayesian sense amounts to
the maximum a posteriori probability (MAP) decision, i.e.,

$$U_0 = \gamma_0(\mathbf{Y}, Z) = \arg\max_{s_i :\, i \in \{0, 1, \ldots, M-1\}} \pi_i\, p(\mathbf{Y}, Z | H_i). \tag{7}$$

Given a specified set of local relay strategies and the channel
characteristics, this MAP decision rule can be obtained in a
straightforward manner. As such, in the next section, we will focus on
the local relay signaling design.

We close this section with a summary of the cooperative relay design
problem.

1) Problem Statement: In a wireless relay network as described in
Fig. 1, given the following:

•  an FA source $\mathcal{S} = \{s_0, \ldots, s_{M-1}\}$ with prior
probabilities $\{\pi_0, \ldots, \pi_{M-1}\}$;

•  the channels from the source to the relay nodes described by
$p(X_l|S)$ for $l = 1, \ldots, L$;

•  the channels from the relay nodes to the destination node described
by $p(Y_l|U_l)$ for $l = 1, \ldots, L$;

•  the channel from the source to the destination node described by
$p(Z|S)$;

•  and a decoding rule $\gamma_0(\cdot)$ at the destination node;

design the local relay rules $\gamma_l(\cdot)$ for $l = 1, \ldots, L$
that minimize the overall error probability at the destination node,
$\Pr(U_0 \neq S)$.

III.  Optimal Local Relay Strategies

This is a joint optimization problem. In order to obtain a globally
optimal scheme, we should simultaneously optimize the local relay
schemes at all the relay nodes. This joint optimization, however, is
not feasible due to the distributed nature of the problem [18]. In this
paper, we adopt a person-by-person optimal (PBPO) approach, i.e., we
optimize the local relay rule $\gamma_l(\cdot)$ for the $l$th relay
node given fixed relay rules at all other relay nodes and a fixed
decoding rule $\gamma_0(\cdot)$ at the destination node. As such, the
conditions obtained are necessary, but not sufficient, for optimality.
This PBPO approach has been widely adopted in various distributed
inference problems (see, e.g., [19] and [20]).

Define

$$\mathbf{U} = [U_1, U_2, \ldots, U_L], \quad \mathbf{X} = [X_1, X_2, \ldots, X_L]$$

so that the error probability at the destination node can be written as

$$P_e = 1 - P_D = 1 - \int \sum_{j=0}^{N-1} P(U_l = u_j | X_l)\, D_{lj}(X_l)\, dX_l \tag{8}$$

where, for $l = 1, \ldots, L$, $j = 0, \ldots, N-1$,

$$D_{lj}(X_l) = \sum_{i=0}^{M-1} \pi_i\, P(U_0 = s_i | U_l = u_j, H_i)\, p(X_l|H_i) \tag{9}$$

and

$$P(U_0 = s_i | U_l = u_j, H_i) = \int\!\!\int P(U_0 = s_i | \mathbf{Y}, Z)\, p(Z|H_i)\, p(\mathbf{Y} | U_l = u_j, H_i)\, d\mathbf{Y}\, dZ. \tag{10}$$

Equations (8)-(10) can be obtained by expanding the error probability
with respect to the relay rule $\gamma_l(\cdot)$. The derivation is
straightforward and follows the same spirit as that in [13]; hence, we
skip the details.

Thus, to minimize $P_e$, or equivalently maximize $P_D$, we set
$P(U_l = u_{j^*} | X_l) = 1$, where $j^*$ is the index that maximizes
$D_{lj}(X_l)$. Hence, we have Theorem 1.

Theorem 1: The optimal relay rule for the $l$th relay node must satisfy

$$U_l = \gamma_l(X_l) = \arg\max_{u_j :\, j \in \{0, 1, \ldots, N-1\}} D_{lj}(X_l) \tag{11}$$

for $D_{lj}(\cdot)$ defined in (9).

The major issue in applying Theorem 1 is evaluating $D_{lj}(\cdot)$.
While it is possible to evaluate it analytically for some special
cases, in general it requires numerical evaluation, which is fairly
straightforward.
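
As a rough illustration of the numerical evaluation mentioned above,
the sketch below computes $D_{lj}(x)$ from (9) and applies the argmax
rule (11) for a scalar observation. It assumes Gaussian likelihoods
$p(x|H_i)$ with hypothetical means, and it takes the table
`p_correct[i][j]`, standing for $P(U_0 = s_i | U_l = u_j, H_i)$ in (10),
as given. All numbers and names are made up for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) at x (assumed observation model)."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def relay_metric(x, j, priors, means, noise_var, p_correct):
    """Eq. (9): D_lj(x) = sum_i pi_i * P(U0 = s_i | U_l = u_j, H_i) * p(x | H_i)."""
    return sum(priors[i] * p_correct[i][j] * gaussian_pdf(x, means[i], noise_var)
               for i in range(len(priors)))


def relay_rule(x, priors, means, noise_var, p_correct):
    """Eq. (11): index j of the relay symbol with the largest metric."""
    n_symbols = len(p_correct[0])
    return max(range(n_symbols),
               key=lambda j: relay_metric(x, j, priors, means, noise_var, p_correct))


# Hypothetical setup: M = 3 equiprobable hypotheses, N = 3 relay symbols.
PRIORS = [1.0 / 3.0] * 3
MEANS = [-2.0, 0.0, 2.0]
NOISE_VAR = 1.0
# P_CORRECT[i][j]: assumed probability of deciding s_i when the relay sends u_j.
P_CORRECT = [[0.90, 0.05, 0.05],
             [0.05, 0.90, 0.05],
             [0.05, 0.05, 0.90]]

print([relay_rule(x, PRIORS, MEANS, NOISE_VAR, P_CORRECT) for x in (-2.0, 0.0, 2.0)])
```

In a full design the table `P_CORRECT` is not fixed in advance; it
depends on the decoding rule and on the other relays through (10), which
is why the relay and decoding rules have to be determined together.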

As expressed in (9) and (10), given the fixed local relay rules of the
other relay nodes, $p(\mathbf{Y} | U_l = u_j, H_i)$, and the decoding
rule at the destination node, $P(U_0 = s_i | \mathbf{Y}, Z)$,
$D_{lj}(\cdot)$ depends on the local observations at the $l$th relay
node and is a linear combination of the likelihood functions of the
local observations. Following the definition of the likelihood ratio
quantizer (LRQ) for multiple hypotheses testing [21], the optimal local
relay rule as described in Theorem 1 is an LRQ.

An important distinction between the current work and that of [22] is
that we are considering an M-ary hypotheses testing problem with
general input (e.g., vector input such as a packet). As such, one does
not have the luxury of reducing the local relay rule to a scalar
quantization problem; instead, one needs to quantize an
$(M-1)$-dimensional sufficient statistic [23]. Thus, convergence
checking by comparing relay rules is generally not viable.

The fact that we use the PBPO criterion implies that the derived
conditions are only necessary but not sufficient conditions for
optimality. Recognizing that the necessary condition for the relay
function $\gamma_l(\cdot)$ is coupled with the decoding rule, we
propose the following iterative algorithm to find relay and decoding
rules that are at least locally optimum.


Iterative  algorithm 


1)  Initialize  the  local  relay  strategies  for  each  relay  node 

/  =  1, . . . ,  P  and  set  the  iteration  index  r  =  1. 

2)  Obtain  the  optimal  decoding  rule  7^^^^  using  (7)  for  fixed 
local  relay  rules  7/^’^”^^  /  =  1, . . . ,  P. 

3)  For  each  /,  obtain  the  PBPO  local  relay  rule  of  /*^ 
relay  node  using  (11)  given  the  fixed  local  relay  rules  for 
the  other  relay  nodes  and  fixed  decoding  rule. 

4)  Evaluate  the  error  probability  at  the  destination 

node  given  the  relay  rules  7^^^^  =  , . . . ,  7!^^^ }  and 

(r)  (r  —  li 

decoding  rule  7q  ^ ,  and  compare  it  with  Pe  .  If  the 
difference  is  less  than  a  prescribed  value,  stop.  Otherwise, 
set  r  =  r  -h  1  and  go  to  Step  2). 


be  non-increasing  after  each  step.  Thus,  the  algorithm  always 
converges  as  the  error  probability  is  lower  bounded  by  zero. 

A.  Special  Cases 

The  relay  network  described  in  Fig.  1  is  rather  general;  it  en¬ 
compasses  many  special  cases.  For  example,  setting  P  =  1  re¬ 
duces  it  to  the  classical  three-node  relay  network;  and  the  cor¬ 
responding  optimum  decoding  rule  and  optimal  local  relay  rule 
can  be  obtained  by  letting  P  =  1  in  (7)  and  (11).  While  this 
three-node  network  is  not  materially  different  from  the  gen¬ 
eral  case,  it  does  significantly  reduces  the  computational  com¬ 
plexity.  Since  there  is  a  single  relay  node,  there  is  no  iteration 
among  the  relay  rules.  Instead,  one  only  needs  to  iterate  between 
the  decoding  rule  and  the  relay  rule. 

Another  interesting  case  is  the  parallel  relay  network  where 
there  is  no  direct  transmission  from  the  source  node  to  the  des¬ 
tination  node.  Following  the  same  spirit  of  the  derivation  in 
Section  III,  we  can  easily  get  the  optimal  decoding  rule  and  op¬ 
timal  relay  rule  which  are  similar  to  (7)  and  (11)  except  that  Z 
is  omitted  from  the  expression. 

We  now  consider  the  simplest  possible  relay  system:  there  is 
only  a  single  (P  =  1)  relay  node  and  there  is  no  direct  link 
between  the  source  and  the  destination  node.  Notice  that  this 
simple  model  can  be  considered  as  a  special  case  of  either  the 
three-node  relay  model  or  the  parallel  network.  We  term  this  as 
a  singular  relay  network.  In  the  context  of  channel  optimized 
quantizer  design  for  WSN,  we  have  shown  in  [14]  and  [22]  that 
for  M  =  2  (i.e.,  a  binary  source),  the  optimum  relay  rule  for  a 
singular  relay  network  is  channel-blind;  i.e.,  the  local  relay  rule 
will  remain  unchanged  when  the  relay-destination  channel  char¬ 
acteristics  change.  For  this  special  case,  the  local  relay  rule  is  the 
same  as  that  in  the  case  with  ideal  relay-destination  channel  as 
this  ideal  channel  can  be  treated  as  a  limiting  case  of  the  fading 
channel.  We  show  in  the  following  that  it  is  not  true  for  the  gen¬ 
eral  case  of  M  >2;  that  is,  for  a  singular  relay  network  with  a 
general  FA  source,  the  relay  signaling  should  always  be  channel 
aware. 

By  setting  P  =  1  in  (7)  and  (11)  and  omitting  Z,  we  can 
easily  obtain  the  decoding  rule 

$$U_0 = \gamma_0(Y) = \arg\max_{s_i} \pi_i\, p(Y|H_i) \qquad (12)$$

and local relay rule

$$U = \gamma(X) = \arg\max_{u_j} D_j(X) \qquad (13)$$

where

$$D_j(X) = \sum_{i=0}^{M-1} \pi_i\, P(U_0 = s_i \mid U = u_j)\, p(X|H_i). \qquad (14)$$
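The relay rule in (13) and (14) can be sketched numerically. The following is a minimal illustration, not the authors' implementation: the function name `relay_rule`, the argument layout, and all numbers are assumptions made for the example. The table `p_correct[i][j]` stands in for $P(U_0 = s_i \mid U = u_j)$, which in practice would come from the relay-destination channel and the current decoding rule.

```python
def relay_rule(priors, p_correct, likelihoods):
    """Evaluate D_j(X) from (14) and return (argmax index, list of D_j).

    priors[i]       : prior probability pi_i of hypothesis H_i
    p_correct[i][j] : P(U_0 = s_i | U = u_j) (hypothetical input here)
    likelihoods[i]  : p(X | H_i) evaluated at the relay observation X
    """
    num_hyp = len(priors)
    num_out = len(p_correct[0])
    d = [
        sum(priors[i] * p_correct[i][j] * likelihoods[i] for i in range(num_hyp))
        for j in range(num_out)
    ]
    best = max(range(num_out), key=lambda j: d[j])
    return best, d


# Illustrative numbers only: M = 3 hypotheses, 3 relay output symbols.
PRIORS = [1 / 3, 1 / 3, 1 / 3]
LIKELIHOODS = [0.40, 0.35, 0.25]
# Ideal relay-destination channel: the destination always recovers the relay symbol.
IDEAL = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# A made-up channel in which relay symbol u_0 is received very unreliably.
NOISY = [[0.34, 0.05, 0.05], [0.33, 0.90, 0.05], [0.33, 0.05, 0.90]]

choice_ideal, _ = relay_rule(PRIORS, IDEAL, LIKELIHOODS)
choice_noisy, _ = relay_rule(PRIORS, NOISY, LIKELIHOODS)
```

With the same observation likelihoods, the selected relay symbol differs between the two hypothetical channel tables, which is the kind of channel dependence the text discusses for $M > 2$.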




For  each  iteration,  we  optimize  one  rule  given  that  the  other 
rules  are  fixed.  Therefore,  the  error  probability  is  guaranteed  to 


Define

$$Z_{jl} = \{X : D_j(X) < D_l(X)\}$$

which specifies a set such that a lower probability of error will
result when the members of the set are assigned to index $j$ instead
of $l$. Define

$$P_{ijl} = P(U_0 = s_i \mid U = u_j) - P(U_0 = s_i \mid U = u_l) \qquad (15)$$

and

$$L_i(X) = \frac{p(X|H_i)}{p(X|H_0)}.$$

Since

$$\sum_{i=0}^{M-1} P_{ijl} = 0$$

we have

$$D_j - D_l = \sum_{i=0}^{M-1} \pi_i\, p(X|H_i)\, P_{ijl}$$

$$= \sum_{i=1}^{M-1} \pi_i\, p(X|H_i)\, P_{ijl} - \pi_0\, p(X|H_0) \sum_{i=1}^{M-1} P_{ijl}$$

$$= \sum_{i=1}^{M-1} \pi_0\, p(X|H_0)\, P_{ijl} \left( \frac{\pi_i}{\pi_0} L_i(X) - 1 \right).$$


From (15), a change of channel characteristics may alter the
value of $P_{ijl}$, which will result in a different region for deciding
index $j$ instead of $l$. In other words, the optimum relay rule
for the singular relay network needs to be channel aware when
M > 2.


where $P_s$ is the power constraint, which is assumed to be the
same for all the relay nodes as well as the source node, $a_l$ is
the channel coefficient, and $\sigma_l^2$ is the variance of channel noise.
At  the  destination  node,  all  the  schemes  implement  the  MAP 
rule  to  obtain  the  final  decision. 

Throughout our simulations, we assume that the channels between
the source and the relay nodes are independent and identically
distributed (i.i.d.) Rayleigh-fading channels with
average SNR denoted by SNRsr. Similarly, the channels between
the relay and destination node are also assumed to be i.i.d.
Rayleigh-fading channels with average SNR denoted by SNRrd
(except for the first example where the two relay nodes experience
different SNRrd). Notice that this is a somewhat simplifying
assumption: in a homogeneous environment where the path loss
exponent is a constant, the above assumption amounts to requiring
that the relay nodes are equidistant to the source node
as well as to the destination node. We will vary one of these two
SNRs with the other fixed; this captures the change in the placement
of the relay nodes in terms of their distances to the source
and to the destination nodes. The SNR for the direct link between
the source and the destination node is denoted as SNRsd.
Further, all the channels are assumed to be slow fading channels
so that the channel coefficients remain unchanged during
the transmission of one symbol or a packet.

The signal sent from the source node is assumed to be a K-bit
codeword drawn from an M-ary codebook with equal probability.
Hence, $M \le 2^K$. Each bit is assumed to use BPSK modulation.
We also assume that the local decision at each relay node
is K bits; thus, the relay output has a maximum alphabet size of
$N = 2^K$.


IV.  Performance  Evaluation 

In this section, through a number of numerical examples, we
demonstrate the performance advantage of our approach over
some existing relay strategies, namely DF and AF, for the relay
network defined in Fig. 1. For DF, each relay node makes its
own decision using a MAP rule, as follows:

$$U_l = \arg\max_{s_i} \pi_i\, p(X_l|H_i), \quad l = 1, \ldots, L \qquad (16)$$
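The MAP rule in (16) can be illustrated for a relay that knows its source-relay channel coefficient. This is only a sketch under assumptions not stated in the source at this point: circularly symmetric complex Gaussian noise with variance `noise_var`, a known coefficient `a`, and the hypothetical function names below. For equiprobable BPSK symbols it reduces to checking the sign of $\mathrm{Re}\{a^{*}X\}$.

```python
import math


def map_relay_decision(x, a, symbols, priors, noise_var):
    """Return the index i maximizing pi_i * p(x | H_i).

    Assumes x = a * s_i + noise, with complex Gaussian noise of variance noise_var,
    so that log p(x | H_i) = -|x - a*s_i|^2 / noise_var + const.
    """
    def log_posterior(i):
        return math.log(priors[i]) - abs(x - a * symbols[i]) ** 2 / noise_var

    return max(range(len(symbols)), key=log_posterior)


def sign_test_decision(x, a):
    """Equiprobable BPSK shortcut: index 1 (symbol +1) if Re{conj(a) * x} >= 0, else 0."""
    return 1 if (a.conjugate() * x).real >= 0 else 0


# Illustrative values only.
A = 0.6 - 0.8j
SYMBOLS = [-1.0, 1.0]
PRIORS = [0.5, 0.5]
received = A * 1.0 + (0.05 + 0.02j)   # symbol +1 plus a small perturbation
decision = map_relay_decision(received, A, SYMBOLS, PRIORS, noise_var=0.5)
```

The two functions agree whenever the priors are equal; with unequal priors the MAP threshold shifts away from zero.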


A.  Parallel  Relay  Network 

We first consider an example that we discussed in Section I,
the parallel relay network with K = 1, M = N = 2 and L = 2,
i.e., a single BPSK symbol is sent from the source and is to
be relayed to the destination node using two relay nodes. We
assume that the BPSK signal has equal prior probability, i.e.,

$$P(S = -1) = P(S = +1) = 0.5.$$


and  re-encodes  it  and  sends  it  to  the  destination  node.  This  is  dif¬ 
ferent  from  the  relay  signaling  specified  in  Theorem  1,  i.e.,  (11) 
and  (9),  where  all  the  relay  rules  are  coupled  with  each  other. 
We remark here that the DF approach considered in this paper
is the vanilla version discussed in [8] and [11]. We assume that
the relay node always forwards its best estimate to the destination
node.

For AF, the output of the relay node is simply a scaled version
of the received signal, i.e.,

$$U_l = c_l X_l, \quad l = 1, \ldots, L$$

where the scaling factor $c_l$ is determined so that all schemes
have the same average power constraint. For fading channels,
we have

$$c_l^2 = \frac{P_s}{P_s |a_l|^2 + \sigma_l^2}$$
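The AF power normalization can be checked with a few lines of code. The sketch below assumes the standard amplify-and-forward scaling $c_l^2 = P_s / (P_s |a_l|^2 + \sigma_l^2)$, with a relay input $X_l = a_l S + \text{noise}$ and a source symbol of average power $P_s$; the function name and the numbers are illustrative assumptions rather than values from the paper.

```python
import math


def af_scaling(p_s, a_l, noise_var):
    """Scaling factor c_l such that the average relay output power equals p_s.

    Assumes the relay input has average power p_s * |a_l|^2 + noise_var.
    """
    return math.sqrt(p_s / (p_s * abs(a_l) ** 2 + noise_var))


# Illustrative values only.
P_S = 2.0
A_L = 0.6 - 0.8j
NOISE_VAR = 0.5
c_l = af_scaling(P_S, A_L, NOISE_VAR)
relay_output_power = c_l ** 2 * (P_S * abs(A_L) ** 2 + NOISE_VAR)
```

By construction `relay_output_power` equals `P_S`, so the AF relay spends the same average power as the source and the other relay schemes.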


We also assume that SNRsr is identical for both relay nodes but
SNRrd may be different. In this case, the relay rule used by DF
for the lth relay node can be easily obtained from (16),

$$\mathrm{SST} = \mathrm{Re}\{a_l^{*} X_l\} \underset{-1}{\overset{+1}{\gtrless}} 0$$

where $a_l$ is the channel coefficient for the channel between
the source node and the lth relay node and Re{·} means real
part. Application of Theorem 1 and our iterative algorithm
shows that our approach also compares SST to a threshold, but
our threshold is obtained by jointly designing the relay rules
and the decoding rule, which leads to performance gains. In
Table I, with identical SNRsr for both relay nodes and different
SNRrd for each relay node, we compare the thresholds of SST
and overall error probability between DF and the proposed
approach. As one can see, the proposed approach has better
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TABLE  I 

Comparison of Thresholds of SST and Error Probability Between DF and Proposed Approach (SNRsr = 5 dB).

SNRrd, first relay | SNRrd, second relay | Relay scheme      | Threshold, first relay | Threshold, second relay | Pe
5 dB               | 0 dB                | DF                | 0                      | 0                       | 0.0080
5 dB               | 0 dB                | Proposed approach | 0.0005                 | -0.0063                 | 0.0073
5 dB               | 5 dB                | DF                | 0                      | 0                       | 0.0063
5 dB               | 5 dB                | Proposed approach | -0.0599                | -0.0599                 | 0.0045
5 dB               | 10 dB               | DF                | 0                      | 0                       | 0.0060
5 dB               | 10 dB               | Proposed approach | -0.0786                | -0.0512                 | 0.0038

Fig. 2. Error probability versus SNR of source-relay channel for L = 2, M =
2, and K = 1 (SNRrd = 5 dB). Curves: Proposed Approach, Amplify and Forward,
Decode and Forward.

Fig. 3. Error probability versus SNR of relay-destination channel for L = 2,
M = 2, and K = 1 (SNRsr = 5 dB).


performance  than  DF  and  the  thresholds  of  SST  are  different 
for  the  different  relays  for  our  approach. 

We then consider a slightly different case where SNRrd is identical
for both relay nodes. Figs. 2 and 3 plot the error probability
at the destination node as a function of SNRsr and SNRrd,
respectively. From Fig. 2, where SNRrd is fixed at 5 dB, the
proposed approach provides the best performance among all three
relay schemes. In Fig. 3, where SNRsr is fixed at 5 dB, AF
outperforms the proposed method at high SNRrd values. This is
not surprising since the optimum performance is achieved with
centralized processing, i.e., when all local observations are
accessible by the decoder. With high SNRrd, the analog signal
can be received at the destination almost noiselessly; hence it
amounts to centralized processing. The proposed scheme
attempts to find the optimum relay scheme among all possible
K-bit quantizers to minimize the error probability at the
destination node. AF does not belong to the class of
K-bit quantizers.

We next consider a special case that we also discussed in
Section I, the repetition coded binary source. This is equivalent
to binary hypothesis testing with soft (multibit) output.
To alleviate the computational burden, one can approximate the
fading channel using a binary symmetric channel (BSC) where
the crossover probability can be properly calculated using the
channel SNR. The BSC provides a reasonable, albeit coarse,
approximation of the fading channel; moreover, one can apply
directly the distributed detection algorithm developed in [22] to
find the optimal relay rules. We thus compare the BSC approximation
with our approach using the actual fading channel model
and the two existing relay strategies (i.e., AF and DF). Consider
the system with L = 2 relay nodes and K = 4 bit source input.
We generate the error probability plots as a function of SNRsr
and SNRrd, respectively. From Fig. 4, where we vary SNRsr but
fix SNRrd = 0 dB, one can see that the proposed approach provides
uniformly better performance compared with the other
alternatives. Notice that all the error probabilities level off as
SNRsr increases. This is not unexpected: with large SNRsr, the
channels between the source and the relay nodes can be considered
as ideal. Thus, the error probability performance is limited
by the finite and fixed SNRrd. We also notice that the BSC
approximation provides a reasonable performance compared with
the proposed approach.
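The text does not spell out how the BSC crossover probability is computed from the channel SNR. One common choice, shown here purely as an assumption, is the bit error rate of coherent BPSK: averaged over Rayleigh fading it is $\tfrac{1}{2}\left(1 - \sqrt{\bar\gamma/(1+\bar\gamma)}\right)$, and for a fixed channel it is $Q(\sqrt{2\gamma}) = \tfrac{1}{2}\,\mathrm{erfc}(\sqrt{\gamma})$, where the SNR is taken per bit. The function names are hypothetical.

```python
import math


def bsc_crossover_rayleigh(avg_snr_db):
    """Average BPSK bit error rate over Rayleigh fading (SNR per bit, in dB)."""
    g = 10 ** (avg_snr_db / 10)
    return 0.5 * (1 - math.sqrt(g / (1 + g)))


def bsc_crossover_fixed(snr_db):
    """BPSK bit error rate for a fixed (non-fading) channel, Q(sqrt(2*snr))."""
    g = 10 ** (snr_db / 10)
    return 0.5 * math.erfc(math.sqrt(g))


# Crossover probabilities for a few average SNR values (in dB).
crossovers = {snr_db: bsc_crossover_rayleigh(snr_db) for snr_db in (-5, 0, 5, 10)}
```

Either expression could serve as the BSC parameter; the fading-averaged version is larger at the same SNR, which is in keeping with the text's description of the BSC as a coarse approximation.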

Fig. 5 is the error probability plot as a function of SNRrd
with fixed SNRsr = 0 dB. Again, one observes an error probability
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Fig.  4.  Error  probability  versus  SNR  of  source-relay  channel  for  L  =  2 ,  M  = 
2,  and  K  =  4  (SNRrd  =  0  dB). 


Fig. 5. Error probability versus SNR of relay-destination channel for L = 2,
M = 2, and K = 4 (SNRsr = 0 dB).


Fig. 6. Error probability versus SNR of source-relay channel for the case using
L = 2 and (7, 4) code as source input (SNRrd = 5 dB).


Fig. 7. Error probability versus SNR of relay-destination channel for the case
using L = 2 and (7, 4) code as source input (SNRsr = 5 dB).


floor as SNRrd increases due to the fact that SNRsr is fixed.
Furthermore, AF eventually outperforms all other schemes
as SNRrd gets large; this is again because at very high channel
SNR between the relays and the destination, AF essentially
amounts to centralized processing. On the other hand, DF
is the first to level off in the error probability performance. This
is because DF uses hard decision relaying, which is clearly
not optimal at high SNR for the channel between the relays and
the destination.

We  also  consider  a  more  practical  scenario  where  the  packet 
is  coded  with  a  (7,  4)  Hamming  code  [24]  with  L  =  2  relay 
nodes,  and  the  generator  matrix  we  use  is 
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As  shown  in  Figs.  6  and  7,  the  proposed  approach  again  has  the 
best  performance. 
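To make the coded scenario concrete, the sketch below encodes 4-bit messages with a (7, 4) Hamming code and maps the bits to BPSK. The generator matrix actually used in the paper is not recoverable from this copy, so the matrix `G` below is a textbook systematic generator used as a stand-in; the bit-to-symbol mapping (0 to -1, 1 to +1) is likewise an assumption.

```python
from itertools import product

# Stand-in systematic generator matrix G = [I_4 | P]; not necessarily the paper's.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def encode(message_bits):
    """Multiply a 4-bit message by G over GF(2) and return the 7-bit codeword."""
    return tuple(
        sum(m * g for m, g in zip(message_bits, column)) % 2
        for column in zip(*G)
    )


def to_bpsk(codeword):
    """Map bits to BPSK symbols, assuming 0 -> -1 and 1 -> +1."""
    return [2 * bit - 1 for bit in codeword]


# The M = 16 codewords form the finite-alphabet source the relays must convey.
CODEBOOK = [encode(bits) for bits in product((0, 1), repeat=4)]
# For a linear code, the minimum distance equals the minimum nonzero codeword weight.
MIN_DISTANCE = min(sum(word) for word in CODEBOOK if any(word))
```

Here the source alphabet has 16 entries carried by 7-bit codewords, so only 16 of the 128 possible 7-bit patterns are valid; this is the redundancy that a relay rule designed for finite-alphabet sources can exploit.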

B.  Three-Node  Relay  Network 

We  compare  the  performance  of  the  proposed  scheme  with 
two  existing  relay  schemes  for  the  classical  three-node  model. 
In  generating  the  error  probability  plots,  we  vary  one  channel 
SNR and fix the other two. As shown in Figs. 8-10, the proposed
approach still has the best performance. When we vary
SNRsr or SNRrd, the plots we obtain are similar to previous
examples: the proposed scheme is uniformly better than others for
varying SNRsr and the advantage of the proposed scheme over
DF diminishes at low SNR for varying SNRrd. Since we have
a direct transmission from source to destination node, when we
vary SNRsd and fix the other two, the performance gain of the
proposed scheme diminishes to zero at high SNR, as shown in
Fig. 10.
Fig. 8. Error probability versus SNR of source-relay channel for classical
model with M = 3 and K = 3 (SNRrd = 5 dB, SNRsd = 5 dB).


Fig. 9. Error probability versus SNR of relay-destination channel for classical
model with M = 3 and K = 3 (SNRsr = 5 dB, SNRsd = 5 dB).


Fig. 10. Error probability versus SNR of source-destination channel for classical
model with M = 3 and K = 3 (SNRsr = 5 dB, SNRrd = 5 dB).


V.  Conclusion 

In  this  paper,  a  novel  cooperative  relay  signaling  that  applies 
channel  aware  decentralized  detection  theory  was  proposed  to 
fully  exploit  the  FA  property  of  the  source  message.  Aimed  at 
minimizing  the  error  probability  at  the  destination  node,  we  de¬ 
rived  the  necessary  conditions  for  an  optimal  distributed  sig¬ 
naling  scheme  for  a  FA  source.  An  iterative  algorithm  was  pre¬ 
sented  to  find  distributed  relay  schemes  that  are  at  least  locally 
optimum.  We  further  examined  some  special  cases,  including 
the  classical  three-node  relay  network  and  the  parallel  relay  net¬ 
work.  For  the  special  case  of  a  single  relay  node  with  no  direct 
link  between  the  source  and  the  destination  node,  i.e.,  the  sin¬ 
gular  relay  network,  we  pointed  out  the  significant  difference 
between  a  binary  source  and  a  general  M-ary  source  (M  >  2), 
that  is,  while  the  optimal  relay  rule  is  channel  blind  for  the  sin¬ 
gular  relay  network  with  a  binary  source,  it  is  channel  aware 
when  M  >  2.  Performance  comparison  with  two  existing  relay 


strategies,  namely  AF  and  DF,  was  conducted  numerically.  In 
almost  all  cases  of  practical  interest,  the  proposed  approach  ex¬ 
hibits  notable  advantages  over  existing  relay  schemes  that  do 
not  exploit  the  redundancy  in  FA  sources. 

One  drawback  of  the  proposed  scheme  is  that  the  optimal  sig¬ 
naling  design  requires  global  channel  information.  Distributed 
signaling  design  that  only  uses  local  channel  information  is 
more practical and will be investigated in the future. Similar
work has been carried out in the context of distributed detection for
sensor networks [25] and can be extended to the cooperative
relay signaling design. Another drawback is that the relay rule
designs of all relay nodes are coupled in the proposed design
approach. This significantly increases the complexity of the
design  algorithm  which  typically  scales  exponentially  in  the 
number  of  nodes.  One  remedy  is  to  resort  to  the  large  system 
regime  to  optimize  the  error  exponent  instead  of  the  error 
probability,  thereby  circumventing  the  iterative  algorithm  that 
is  needed  to  achieve  the  person-by-person  optimality  in  error 
probability  performance. 
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Abstract — We  study  in  this  paper  the  sum  capacity  achievabil- 
ity  of  orthogonal  transmissions  in  vector  Gaussian  multiple  access 
channels  (MAC).  Specifically,  we  derive  sufficient  and  necessary 
conditions,  in  terms  of  channel  matrices  and  transmitter  power 
constraints,  for  orthogonal  transmissions  to  achieve  the  sum 
capacity  of  a  vector  Gaussian  MAC.  The  obtained  conditions 
provide  a  unified  framework  that  helps  explain  many  intuitive 
and  known  results  as  well  as  explore  cases  that  have  not  been 
addressed.  In  the  cases  when  these  conditions  are  violated,  our 
results  enable  us  to  quantify  the  suboptimality  of  orthogonal 
transmission  when  the  sum  capacity  can  only  be  achieved  by 
overlay  transmission. 

Index  Terms — Sum  capacity,  vector  Gaussian  multiple  access 
channel,  frequency  division  multiple  access. 

I. Introduction

IN A MULTIPLE access channel (MAC), multiple transmitters
communicate with a single receiver. The capacity
region of a two-user MAC is the closure of all $(R_1, R_2)$ pairs
satisfying

$$R_1 < I(X_1; Y | X_2),$$

$$R_2 < I(X_2; Y | X_1),$$

$$R_1 + R_2 < I(X_1, X_2; Y),$$

for some product distribution $p_1(x_1) p_2(x_2)$ [1], [2], where
$\mathcal{X}_1$, $\mathcal{X}_2$, $\mathcal{Y}$ are respectively the transmit and receive alphabets.
For a two-user scalar Gaussian MAC, the capacity region is
specified by

$$R_1 < \frac{1}{2} \log\left(1 + \frac{P_1}{N}\right),$$

$$R_2 < \frac{1}{2} \log\left(1 + \frac{P_2}{N}\right),$$

$$R_1 + R_2 < \frac{1}{2} \log\left(1 + \frac{P_1 + P_2}{N}\right),$$

where $P_1$ and $P_2$ are respectively the average power constraints
of the two transmitters, and N is the noise variance at the
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receiver.  While  it  is  shown  that  the  capacity  region  is  achiev¬ 
able  using  overlay  transmission,  it  is  also  well  known  that, 
for  a  scalar  Gaussian  MAC,  orthogonal  transmissions,  e.g., 
frequency  division  multiple  access  (FDMA)  or  time  division 
multiple  access  (TDMA)  under  an  average  power  constraint, 
can achieve the sum capacity [2]. As such, although FDMA
and TDMA are suboptimal in terms of the entire capacity region
[3], if only the system throughput is of concern, orthogonal
transmissions are sufficient, resulting in a much simplified
transceiver structure, i.e., no successive interference cancellation
is needed. A similar result holds for a scalar Gaussian
MAC with more than two users.
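The scalar claim can be verified numerically. The sketch below assumes the textbook real-valued Gaussian MAC expressions, with sum capacity $\tfrac{1}{2}\log_2(1 + (P_1 + P_2)/N)$ and an FDMA user that occupies a bandwidth fraction $\alpha$ and therefore sees noise power $\alpha N$; the function names and numbers are illustrative only.

```python
import math


def sum_capacity(p1, p2, noise):
    """Sum capacity of the two-user scalar Gaussian MAC, in bits per channel use."""
    return 0.5 * math.log2(1 + (p1 + p2) / noise)


def fdma_sum_rate(alpha, p1, p2, noise):
    """FDMA sum rate when user 1 gets bandwidth fraction alpha (0 < alpha < 1)."""
    r1 = 0.5 * alpha * math.log2(1 + p1 / (alpha * noise))
    r2 = 0.5 * (1 - alpha) * math.log2(1 + p2 / ((1 - alpha) * noise))
    return r1 + r2


# Illustrative values only.
P1, P2, NOISE = 3.0, 1.0, 1.0
alpha_star = P1 / (P1 + P2)   # bandwidth split proportional to power
gap_at_optimum = sum_capacity(P1, P2, NOISE) - fdma_sum_rate(alpha_star, P1, P2, NOISE)
gap_at_equal_split = sum_capacity(P1, P2, NOISE) - fdma_sum_rate(0.5, P1, P2, NOISE)
```

With the bandwidth split proportional to the powers, the gap is zero up to rounding, while any other split leaves a strictly positive gap.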

With  vector  Gaussian  MAC,  the  above  claim  -  that  orthogo¬ 
nal  transmissions  achieve  the  sum  capacity  -  is  not  necessarily 
true.  Indeed,  it  is  observed  that  in  most  cases  orthogonal 
transmissions  fall  well  short  of  achieving  the  sum  capacity 
of  a  vector  Gaussian  MAC  [4].  The  goal  of  this  study  is  two¬ 
fold.  First,  we  establish  sufficient  and  necessary  conditions  for 
orthogonal  transmissions  to  be  optimal  in  achievable  sum  rate 
for  a  vector  Gaussian  MAC.  The  established  conditions,  in 
terms  of  singular  values  and  singular  vectors  of  the  channel 
matrices  as  well  as  the  power  constraints,  provide  a  unified 
framework  behind  many  intuitive  and  well  known  results.  In 
addition,  it  allows  us  to  examine  cases  that  have  not  been 
explored  before  in  terms  of  the  (sub)optimality  of  orthogonal 
transmissions  for  vector  Gaussian  MAC.  We  show  that  the 
channel  must  have  proportional  singular  values,  well  aligned 
singular  vectors  and  appropriate  power  constraints  in  order  for 
FDMA/TDMA  to  achieve  the  sum  capacity.  Secondly,  using 
the established conditions, we attempt to provide a quantitative
measure for the performance degradation of orthogonal
transmissions when they are suboptimal.

The  paper  is  organized  as  follows.  In  Section  II,  we  present 
the  channel  model  and  give  the  main  results,  namely  the 
sufficient  and  necessary  conditions  in  terms  of  channel  ma¬ 
trices  and  power  constraints,  for  FDMA  to  achieve  the  sum 
capacity.  The  equivalence  of  FDMA  and  TDMA  in  terms  of 
achievable  sum  rate  is  also  established  in  section  II.  In  Section 
III,  we  examine  several  cases  using  the  new  framework  to 
determine  the  (sub)optimality  of  FDMA.  In  the  cases  when 
FDMA  becomes  suboptimal,  we  propose  a  heuristic  metric 
to  quantify  the  degradation  of  sum  capacity  achievability  of 
FDMA  in  section  IV.  We  conclude  in  Section  V. 

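The following notations will be used throughout the paper:
$\mathbf{A}^\dagger$, $|\mathbf{A}|$, $\mathrm{tr}(\mathbf{A})$, $\mathrm{rank}(\mathbf{A})$, $\|\mathbf{A}\|$ are respectively the Hermitian
transpose, determinant, trace, rank, and 2-norm of matrix $\mathbf{A}$; $\bar{x} = 1 - x$;
$(x)^+ = \max\{x, 0\}$; $\mathrm{diag}(a_1, \cdots, a_n)$ is a diagonal
matrix with the diagonal entries $a_1, \cdots, a_n$.


II. Main Results

Consider a vector Gaussian MAC

$$\mathbf{y} = \mathbf{H}_1 \mathbf{x}_1 + \mathbf{H}_2 \mathbf{x}_2 + \mathbf{z},$$

where $\mathbf{H}_i$ is an $n_r \times n_{t_i}$ full rank channel matrix, $\mathbf{x}_i$ and
$\mathbf{y}$ are $n_{t_i} \times 1$ transmit and $n_r \times 1$ receive signal vectors
respectively, $\mathbf{z}$ is an $n_r \times 1$ complex Gaussian noise vector, with
$E(\mathbf{z}) = \mathbf{0}$ and $E(\mathbf{z}\mathbf{z}^\dagger) = \mathbf{I}$, where $\mathbf{I}$ is the $n_r \times n_r$ identity matrix.

The covariance matrix of $\mathbf{x}_i$ is denoted by $\mathbf{S}_i = E(\mathbf{x}_i \mathbf{x}_i^\dagger)$,
with power constraint $\mathrm{tr}(\mathbf{S}_i) \le P_i$. Both the transmitters and
receivers are assumed to have full channel state information.
For simplicity, we use MAC$(\mathbf{H}_1, \mathbf{H}_2, P_1, P_2)$ to denote this
vector Gaussian MAC, of which the sum capacity is

$$C = \max_{\mathrm{tr}(\mathbf{S}_1) \le P_1,\ \mathrm{tr}(\mathbf{S}_2) \le P_2} \log\left| \mathbf{H}_1 \mathbf{S}_1 \mathbf{H}_1^\dagger + \mathbf{H}_2 \mathbf{S}_2 \mathbf{H}_2^\dagger + \mathbf{I} \right|. \qquad (1)$$

It was established in [5] that the sufficient and necessary
condition to achieve the sum capacity is the mutually water-filling
scheme, i.e., choose $\mathbf{S}_i$ as the single-user water-filling
covariance matrix by treating the other user's signals as channel
noise. On the other hand, the maximum achievable sum rates
by using FDMA and TDMA are respectively

$$C_F = \max_{0 \le \alpha \le 1} C_F(\alpha), \qquad C_T = \max_{0 \le \alpha \le 1} C_T(\alpha),$$

where $\alpha$ is the fraction of bandwidth or time allocated to the
first user. By normalizing the bandwidth or the time, we obtain
from [2, (15.150), (15.151)]

$$C_F(\alpha) = \max_{\mathrm{tr}(\mathbf{S}_1) \le P_1,\ \mathrm{tr}(\mathbf{S}_2) \le P_2} \left\{ \alpha \log\left| \frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_1 \mathbf{H}_1^\dagger + \mathbf{I} \right| + \bar\alpha \log\left| \frac{1}{\bar\alpha} \mathbf{H}_2 \mathbf{S}_2 \mathbf{H}_2^\dagger + \mathbf{I} \right| \right\}, \qquad (2)$$

$$C_T(\alpha) = \max_{\mathrm{tr}(\mathbf{S}_1) = \frac{P_1}{\alpha},\ \mathrm{tr}(\mathbf{S}_2) = \frac{P_2}{\bar\alpha}} \left\{ \alpha \log\left| \mathbf{H}_1 \mathbf{S}_1 \mathbf{H}_1^\dagger + \mathbf{I} \right| + \bar\alpha \log\left| \mathbf{H}_2 \mathbf{S}_2 \mathbf{H}_2^\dagger + \mathbf{I} \right| \right\}. \qquad (3)$$

Both $C_F(\alpha)$ and $C_T(\alpha)$ are obtained by two independent
single-user water-fillings in their respective channels. We will
show later that $C_F(\alpha) = C_T(\alpha)$ for all $\alpha$ for a given MAC;
therefore we can explore the achievability of sum capacity
focusing only on FDMA. In addition, we have Proposition
1, whose proof is nearly identical to the proof of concavity
of FDMA sum rate for the vector Gaussian interference
channel [6, (14)], as the two sum rates bear exactly the same
expression.

Proposition 1: $C_F(\alpha)$ is a concave function of $\alpha$.

Proposition 1 guarantees convergence of simple gradient
methods to the global maximum [7].

Before proceeding, we first show the following lemma.

Lemma 1: If $\mathbf{S}_{\mathrm{opt}} = \arg\max_{\mathrm{tr}(\mathbf{S}) \le P} \log\left| \frac{1}{\alpha} \mathbf{H} \mathbf{S} \mathbf{H}^\dagger + \mathbf{I} \right|$,
where $\alpha > 0$ is a constant, then for any $\beta \in (0, 1)$, the
two matrices $\frac{\beta}{\alpha} \mathbf{H} \mathbf{S}_{\mathrm{opt}} \mathbf{H}^\dagger$ and $\frac{\bar\beta}{\alpha} \mathbf{H} \mathbf{S}_{\mathrm{opt}} \mathbf{H}^\dagger$ satisfy the mutually
water-filling condition.

Proof: Consider a MAC$\left(\frac{1}{\sqrt{\alpha}}\mathbf{H}, \frac{1}{\sqrt{\alpha}}\mathbf{H}, \beta P, \bar\beta P\right)$; the sum
capacity is

$$C = \max_{\mathrm{tr}(\mathbf{S}_1) \le \beta P,\ \mathrm{tr}(\mathbf{S}_2) \le \bar\beta P} \log\left| \frac{1}{\alpha} \mathbf{H} \mathbf{S}_1 \mathbf{H}^\dagger + \frac{1}{\alpha} \mathbf{H} \mathbf{S}_2 \mathbf{H}^\dagger + \mathbf{I} \right|$$

$$= \max_{\mathrm{tr}(\mathbf{S}_1) \le \beta P,\ \mathrm{tr}(\mathbf{S}_2) \le \bar\beta P} \log\left| \frac{1}{\alpha} \mathbf{H} (\mathbf{S}_1 + \mathbf{S}_2) \mathbf{H}^\dagger + \mathbf{I} \right|$$

$$\le \max_{\mathrm{tr}(\mathbf{S}) \le P} \log\left| \frac{1}{\alpha} \mathbf{H} \mathbf{S} \mathbf{H}^\dagger + \mathbf{I} \right|$$

$$= \log\left| \frac{1}{\alpha} \mathbf{H} \mathbf{S}_{\mathrm{opt}} \mathbf{H}^\dagger + \mathbf{I} \right|.$$

The above sum capacity is achieved by choosing $\mathbf{S}_1 = \beta \mathbf{S}_{\mathrm{opt}}$
and $\mathbf{S}_2 = \bar\beta \mathbf{S}_{\mathrm{opt}}$. According to Theorem 1 of [5], the two
matrices $\frac{\beta}{\alpha} \mathbf{H} \mathbf{S}_{\mathrm{opt}} \mathbf{H}^\dagger$ and $\frac{\bar\beta}{\alpha} \mathbf{H} \mathbf{S}_{\mathrm{opt}} \mathbf{H}^\dagger$ satisfy the mutually
water-filling condition. ■

Our goal is to find the sufficient and necessary conditions
such that $C_F = C$. Our main result is summarized below.

Theorem 1: For a MAC$(\mathbf{H}_1, \mathbf{H}_2, P_1, P_2)$, FDMA can
achieve its sum capacity if and only if there exist $0 < \alpha < 1$,
$\mathbf{S}_{1\mathrm{opt}}$, and $\mathbf{S}_{2\mathrm{opt}}$ that jointly satisfy

$$\frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger = \frac{1}{\bar\alpha} \mathbf{H}_2 \mathbf{S}_{2\mathrm{opt}} \mathbf{H}_2^\dagger, \qquad (4)$$

$$\mathbf{S}_{1\mathrm{opt}} = \arg\max_{\mathrm{tr}(\mathbf{S}_1) \le P_1} \log\left| \frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_1 \mathbf{H}_1^\dagger + \mathbf{I} \right|, \qquad (5)$$

$$\mathbf{S}_{2\mathrm{opt}} = \arg\max_{\mathrm{tr}(\mathbf{S}_2) \le P_2} \log\left| \frac{1}{\bar\alpha} \mathbf{H}_2 \mathbf{S}_2 \mathbf{H}_2^\dagger + \mathbf{I} \right|. \qquad (6)$$

Proof: Sufficient condition From (4)-(6), by choosing
$\mathbf{S}_1 = \mathbf{S}_{1\mathrm{opt}}$ and $\mathbf{S}_2 = \mathbf{S}_{2\mathrm{opt}}$, the achievable sum rate is

$$\log\left| \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger + \mathbf{H}_2 \mathbf{S}_{2\mathrm{opt}} \mathbf{H}_2^\dagger + \mathbf{I} \right| = \log\left| \frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger + \mathbf{I} \right| = \max_{\mathrm{tr}(\mathbf{S}_1) \le P_1} \log\left| \frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_1 \mathbf{H}_1^\dagger + \mathbf{I} \right|.$$

From Lemma 1, $\mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger$ and $\frac{\bar\alpha}{\alpha} \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger$ (or
$\mathbf{H}_2 \mathbf{S}_{2\mathrm{opt}} \mathbf{H}_2^\dagger$) satisfy the mutually water-filling condition.
From Theorem 1 of [5], they achieve the sum capacity.

Apply FDMA to the same channel with $\mathbf{S}_1 = \mathbf{S}_{1\mathrm{opt}}$, $\mathbf{S}_2 = \mathbf{S}_{2\mathrm{opt}}$
and the bandwidth allocation factor $\alpha$; the sum rate is

$$C_F = \alpha \log\left| \frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger + \mathbf{I} \right| + \bar\alpha \log\left| \frac{1}{\bar\alpha} \mathbf{H}_2 \mathbf{S}_{2\mathrm{opt}} \mathbf{H}_2^\dagger + \mathbf{I} \right|$$

$$= \alpha \log\left| \frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger + \mathbf{I} \right| + \bar\alpha \log\left| \frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger + \mathbf{I} \right|$$

$$= \log\left| \frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger + \mathbf{I} \right| = C,$$

i.e., it achieves the sum capacity.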
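The FDMA sum rate $C_F(\alpha)$ in (2) depends on the channels only through their singular values, since each user water-fills its own channel. The sketch below is a minimal illustration under that observation, not the authors' code: it takes lists of singular values, performs the two water-fillings, and locates the maximizing $\alpha$ with a ternary search, which is justified here by the concavity stated in Proposition 1. The function names, the natural-log rate unit, and the test values are assumptions.

```python
import math


def waterfill(gains, power):
    """Water-filling over parallel modes with positive gains g_i.

    Returns powers p_i = max(level - 1/g_i, 0) with sum(p_i) = power.
    """
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    inv = [1.0 / gains[i] for i in order]
    for m in range(len(inv), 0, -1):          # try the m strongest modes
        level = (power + sum(inv[:m])) / m
        if level > inv[m - 1]:                # weakest active mode gets power > 0
            break
    alloc = [0.0] * len(gains)
    for rank, i in enumerate(order):
        if rank < m:
            alloc[i] = level - inv[rank]
    return alloc


def fdma_rate_vector(alpha, svals1, svals2, p1, p2):
    """C_F(alpha) in nats, given the singular values of H_1 and H_2."""
    def user_rate(frac, svals, power):
        gains = [s * s / frac for s in svals]   # eigenvalues of (1/frac) H^dagger H
        alloc = waterfill(gains, power)
        return frac * sum(math.log(1.0 + g * p) for g, p in zip(gains, alloc))

    return user_rate(alpha, svals1, p1) + user_rate(1.0 - alpha, svals2, p2)


def best_alpha(svals1, svals2, p1, p2, tol=1e-9):
    """Ternary search for the maximizer of the concave function C_F(alpha)."""
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if fdma_rate_vector(a, svals1, svals2, p1, p2) < fdma_rate_vector(b, svals1, svals2, p1, p2):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)
```

A gradient method, as mentioned in the text, would serve equally well; ternary search is used here only because it needs no derivative. Note that this sketch evaluates $C_F$ alone; comparing it with the sum capacity $C$ in (1) additionally requires the singular vectors.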

Necessary condition Assume FDMA can achieve the sum
capacity with $\alpha$, $\mathbf{S}_{1\mathrm{opt}}$, $\mathbf{S}_{2\mathrm{opt}}$. We only need to show that (4)
must be satisfied since (5) and (6) must be true. From the
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assumption,  we  have 


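$$C = \alpha \log\left| \frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger + \mathbf{I} \right| + \bar\alpha \log\left| \frac{1}{\bar\alpha} \mathbf{H}_2 \mathbf{S}_{2\mathrm{opt}} \mathbf{H}_2^\dagger + \mathbf{I} \right|$$

$$\overset{(a)}{\le} \log\left| \alpha \left( \frac{1}{\alpha} \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger + \mathbf{I} \right) + \bar\alpha \left( \frac{1}{\bar\alpha} \mathbf{H}_2 \mathbf{S}_{2\mathrm{opt}} \mathbf{H}_2^\dagger + \mathbf{I} \right) \right|$$

$$\le C,$$

where (a) follows from the concavity of $\log|\cdot|$ for positive
semi-definite matrices [2], with equality if and only if (4) is
true. Since equality must hold, (4) must be true. ■

Conditions (5) and (6) can be interpreted as that $\mathbf{S}_{i\mathrm{opt}}$ water-fills
for the given $\alpha$. To be able to dissect more complicated
cases, we next present Theorem 2, derived directly from Theorem 1.
We assume that the channel matrices admit respective
singular value decompositions $\mathbf{H}_i = \mathbf{U}_i \mathbf{\Sigma}_i \mathbf{V}_i^\dagger$, $i = 1, 2$.

We denote by $\sigma_{ij}$ and $\mathbf{u}_{ij}$ the $j$th singular value and left
singular vector for $\mathbf{H}_i$. Without loss of generality, we assume
$\sigma_{i1} \ge \sigma_{i2} \ge \cdots \ge \sigma_{i r_i} > 0$, where $r_i = \mathrm{rank}(\mathbf{H}_i)$, $i = 1, 2$.

where $v_1$ and $v_2$ are the water-filling levels satisfying

$$\sum_{i=1}^{r_1} \left( v_1 - \frac{\alpha}{\sigma_{1i}^2} \right)^+ = P_1, \qquad (14)$$

$$\sum_{i=1}^{r_2} \left( v_2 - \frac{\bar\alpha}{\sigma_{2i}^2} \right)^+ = P_2. \qquad (15)$$

Assume the power is allocated until the $m_1$th and $m_2$th eigenmodes
for user 1 and user 2 respectively. Then $m_1 = m_2 = m$,
because $\frac{\sigma_{1i}^2 v_1}{\alpha}$, $i \le m_1$, and $\frac{\sigma_{2i}^2 v_2}{\bar\alpha}$, $i \le m_2$,
are strictly greater than 1. Thus

$$v_1 = \frac{1}{m} \left( P_1 + \alpha \sum_{i=1}^{m} \frac{1}{\sigma_{1i}^2} \right), \qquad (16)$$

$$v_2 = \frac{1}{m} \left( P_2 + \bar\alpha \sum_{i=1}^{m} \frac{1}{\sigma_{2i}^2} \right). \qquad (17)$$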


Theorem 2: For a MAC$(\mathbf{H}_1, \mathbf{H}_2, P_1, P_2)$, FDMA achieves
the sum capacity if and only if there exists an integer $1 \le m \le \min\{r_1, r_2\}$
that satisfies the following conditions.

Singular value conditions For some constant k,

.^2 

=  k 


In  (13),  the  two  matrices  on  both  sides  must  have  the 
same  eigenvalues  and  the  same  corresponding  eigenvector 
subspaces,  therefore,  must  satisfy  the  singular  vector 
conditions  (8)  and  singular  value  conditions 


^11 


"21 


^  Im 


^2m 


aV2 

avi 


^k. 


(18) 


'Im 


cr; 


21 


cr; 


(7) 


Substitute  (17)  into  (18)  we  have  (12)  and  (9).  In  order  that 
power  is  allocated  until  the  m^^  element  we  must  have-^  < 


2m 


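Singular vector conditions For any $\sigma_{1,n_1-1} \ne \sigma_{1 n_1} =
\sigma_{1,n_1+1} = \cdots = \sigma_{1 n_2} \ne \sigma_{1,n_2+1}$ with $1 \le n_1 \le n_2 \le m$,

$$\mathcal{S}\{\mathbf{u}_{1 n_1}, \cdots, \mathbf{u}_{1 n_2}\} = \mathcal{S}\{\mathbf{u}_{2 n_1}, \cdots, \mathbf{u}_{2 n_2}\}, \qquad (8)$$

where $\mathcal{S}\{\mathbf{u}_1, \cdots, \mathbf{u}_L\}$ denotes the subspace spanned by $\mathbf{u}_1,
\cdots, \mathbf{u}_L$. In the event that all singular values are distinct, we
have $\mathbf{u}_{1i} = \pm \mathbf{u}_{2i}$ for $1 \le i \le m$.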

Power  constraint  conditions 


Im+l 


and 


<  V2 


Equations  (7)  and  (8)  establish  that  the  two  channel  matrices 
must  have  proportional  singular  values  and  perfectly  aligned 
singular  vectors,  while  the  last  condition  dictates  that  the  cor¬ 
responding  power  constraints  must  be  such  that  the  respective 
water-filling  uses  the  same  number  of  eigenmodes  for  the  two 
users  in  the  FDMA  transmission  for  the  optimal  a. 

For  a  MAC  with  2^  ^ 


,  even  if  the  singular  vector 

^2m  '  ^2,m+l 

conditions  in  Theorem  2  are  satisfied,  if  Pi  >  —  — 


V1P2  =  V2P1, 


^+1 


where 


ri 

E 

i=l 

r2 

E 

i=l 


Vi 


V2 


Mi 


,a 


Mi 


m 

E 

i=l 

m 


Vi 


Mi 


=  Pi,  (10) 


(9)  Yl^=i  ^  either  7  =  1  or  2,  the  power  conditions  are 
violated  and  FDMA  is  suboptimal  due  to  the  generous  power 
constraint,  which  favors  overlay  transmission  with  successive 
interference  cancellation.  Proposition  2  shows  the  relation  of 
power  conditions  and  the  achievability  of  sum  capacity  of 
FDMA. 

Proposition  2:  For  a  MAC(Hi,H2, 


11  _ 

“  — 

21 


a  = 


kPi 


■•A 


and  the  singular  vector  conditions  in 


kPi  +  P2 


(12) 


Proof:  From  (4)  we  have 


-^2m  ^2,m+l 

Theorem  2  are  satisfied,  FDMA  achieves  the  sum  capacity  if 
and  only  if  the  power  constraint  pair  (Pi,P2)  belongs  to  P, 
where 


1 


UiSiSiS'iUl  + 1  =  4u2S2S2S2U^  +  I, 


VM{{Pi.P2)\kPi+P2<mk 

(  1 


where  Si  =  VjSioptVi 
we  have 


and  S2  =  VTS2optV2.  With  (5)-(6) 


.2  iL^2 

l,m+l5  ^^2,m+l 


-A 


<7"^!  m 


,Pi,P2  >0  ^,(19) 


Ui  diag 


a 


where 


is  the  harmonic  mean  of  i 


a 


il  5  ‘ 


/I 


t 


U2  diag 


al^V2 


2 

/^2m2 


V2 


,!,■■■  A 


M,(13) 


^  m 
A  _  ^1 
777  ^ 


i=l 


1 

4' 
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Proof:  If  the  power  is  allocated  up  to  the  eigenmode 
for  both  users,  from  Theorem  2  (10)  and  (11),  Cp  =  C  if 

a  ,  a 


(j 


<vi  </- 


a 


<V2  </- 


l,m+l 

a 


^2,m  ^2,m+l 

Substitute  (10)-(12)  to  (20)  and  (21)  we  have 


(20) 

(21) 


mk  - —  <  kPi  +  P2 

<mk  - 2^,2 - .(22) 

Therefore, for all the power constraint pairs $(P_1, P_2)$ satisfying
(22), FDMA achieves the sum capacity. Notice that it is not
necessary that the power be allocated up to the $m$th
eigenmode for FDMA to be optimal. If both users allocate the
power up to the $t$th eigenmode and the power constraint
conditions in Theorem 1 are also satisfied, FDMA can still
achieve the sum capacity. Considering the same constraints in
(20) and (21) with $m$ replaced by $t$, we have (23), where
$t = 1, \cdots, m - 1$. Denote the sets defined in (23) as $\mathcal{P}_i$, $i =
1, \cdots, m - 1$, and the set in (22) as $\mathcal{P}_m$. For the MAC, if
the power constraint pair satisfies $(P_1, P_2) \in \mathcal{P}$, FDMA can
achieve the sum capacity, where $\mathcal{P} = \bigcup_{i=1}^{m} \mathcal{P}_i$, which is the
same as (19). In (19) we exclude the trivial cases in which either
$P_1$ or $P_2$ is zero, where FDMA always achieves the sum capacity
since the channel reduces to a single-user channel. ■

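In the following, we establish the equivalence of FDMA
and TDMA in terms of achievable sum rate.

Proposition 3: For a MAC$(\mathbf{H}_1, \mathbf{H}_2, P_1, P_2)$, $C_F(\alpha) =
C_T(\alpha)$ for all $0 < \alpha < 1$.

Proof: Define $\mathbf{S}_1' = \frac{\mathbf{S}_1}{\alpha}$ and $\mathbf{S}_2' = \frac{\mathbf{S}_2}{\bar\alpha}$ and substitute them
into (2); it can be shown that (3) and (2) are equivalent. ■

Therefore all the results of FDMA can be readily extended
to TDMA.

Finally, we extend Theorem 1 to the multiple-user MAC.
The proof follows exactly the same steps as the proof of Theorem 1, and
is omitted because of the space limit.

Theorem 3: For a K-user MAC$(\mathbf{H}_1, \ldots, \mathbf{H}_K, P_1, \ldots, P_K)$,
FDMA achieves the sum capacity if and only if there exist
$0 < \alpha_i < 1$ with $\sum_{i=1}^{K} \alpha_i = 1$, and $\mathbf{S}_{i\mathrm{opt}}$, $i = 1, \ldots, K$, that jointly
satisfy

$$\frac{1}{\alpha_1} \mathbf{H}_1 \mathbf{S}_{1\mathrm{opt}} \mathbf{H}_1^\dagger = \cdots = \frac{1}{\alpha_K} \mathbf{H}_K \mathbf{S}_{K\mathrm{opt}} \mathbf{H}_K^\dagger, \qquad (24)$$

$$\mathbf{S}_{i\mathrm{opt}} = \arg\max_{\mathrm{tr}(\mathbf{S}_i) \le P_i} \log\left| \frac{1}{\alpha_i} \mathbf{H}_i \mathbf{S}_i \mathbf{H}_i^\dagger + \mathbf{I} \right|. \qquad (25)$$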

III.  Cases  for  FDMA  being  sum-capacity  optimal 

The sufficient and necessary conditions in Theorem 1 or
2 appear to be overly restrictive. Such conditions are rarely
satisfied for the general vector Gaussian MAC. The results,
however, provide a unified approach to determine the sum
capacity optimality of orthogonal transmissions. More importantly,
Theorem 2 also allows us to gain insight into how
to quantify the suboptimality of orthogonal transmissions, as
demonstrated later in this section.


[Figure 1: two panels (Example 2, Example 3), each showing the sum capacity $C$ and the FDMA sum rate $C_F(\alpha)$ versus the frequency band allocation $\alpha$, for $\alpha$ from 0 to 1.]


Fig. 1. Sum rate of FDMA versus frequency allocation factor, where, for
Example 2, $n_r = n_{t_1} = n_{t_2} = 8$, $P_1 = P_2 = 1$, $\gamma = 2$, $H_2$ and $A$ are
randomly chosen; for Example 3, $n_r = n_{t_1} = 8$, $n_{t_2} = 10$, $\sigma_1 = \sqrt{5}$,
$\sigma_2 = \sqrt{15}$, $P_1 = P_2 = 1$, $U_1$, $U_2$, $V_1$ and $V_2$ are randomly chosen.


In the following, we demonstrate the utility of the proposed
sufficient and necessary conditions by reviewing some examples
in which FDMA is sum capacity optimal.

Example 1: $n_r = 1$, $n_{t_1}, n_{t_2} > 1$.

In this case, both channels have only one singular value,
$\sigma_1 = \|H_1\|$, $\sigma_2 = \|H_2\|$, $m = 1$, $k = \sigma_1^2/\sigma_2^2$, and
$U_1 = U_2 = 1$. From (10) and (11) we obtain

$$\nu_i = P_i \left( 1 + \frac{1}{\|H_1\|^2 P_1 + \|H_2\|^2 P_2} \right), \quad i = 1, 2.$$

Therefore all the conditions in Theorem 2 hold and $C_F = C$.
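
Example 1 can be checked numerically. The sketch below is an illustration, not the paper's derivation: it assumes unit noise power, rates in bits (log base 2), and that each user beamforms so its effective channel gain is $g_i = \|H_i\|^2$. The function names and the numerical values are hypothetical. Under these assumptions the band split $\alpha = g_1 P_1 / (g_1 P_1 + g_2 P_2)$ makes the FDMA sum rate equal the sum capacity, while another split falls short.

```python
import math

def sum_capacity(g1, g2, p1, p2):
    """Sum capacity (bits) of a two-user MAC with one receive antenna,
    assuming unit noise power and effective gains g_i = ||H_i||^2."""
    return math.log2(1.0 + g1 * p1 + g2 * p2)

def fdma_sum_rate(alpha, g1, g2, p1, p2):
    """FDMA sum rate when user 1 gets band fraction alpha and user 2 gets 1 - alpha."""
    return (alpha * math.log2(1.0 + g1 * p1 / alpha)
            + (1.0 - alpha) * math.log2(1.0 + g2 * p2 / (1.0 - alpha)))

def optimal_alpha(g1, g2, p1, p2):
    """Band split proportional to each user's received power."""
    return g1 * p1 / (g1 * p1 + g2 * p2)

# Illustrative (assumed) gains and powers.
g1, g2 = 3.0, 0.5
p1, p2 = 1.0, 2.0

alpha_star = optimal_alpha(g1, g2, p1, p2)
c = sum_capacity(g1, g2, p1, p2)
c_f = fdma_sum_rate(alpha_star, g1, g2, p1, p2)
c_f_half = fdma_sum_rate(0.5, g1, g2, p1, p2)
print(alpha_star, c, c_f, c_f_half)
```

With $k = g_1/g_2$, the split used here can also be written as $\alpha = kP_1/(kP_1 + P_2)$.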

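Example 2: $H_1 = \gamma H_2 A$, where $\gamma$ is a constant and $A A^\dagger = I$.
Define $n_t = n_{t_1} = n_{t_2}$ and $r = r_1 = r_2$; we have $U_1 = U_2$
and $k = \sigma_{1i}^2 / \sigma_{2i}^2 = \gamma^2$, $i = 1, \cdots, r$. With these and (10) and (11) we

have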

77-/ _ 2^ _ 

fr;  Vf"!  Vii  (7V1  +  P2) 

\P2  'cJli  (7'A+P2) 

=  1. 

Then  ^  Depending  on  Pi,P2,  ip  can  be  any  integer 

between 1 and $r$. Therefore, all the conditions of Theorem
2 are satisfied and $C_F = C$. Intuitively, as $A$ is a unitary
matrix, one can apply the capacity-preserving precoding $A$ to $x_2$,
resulting in

$$y = H_1 x_1 + H_2 A x_2 + z = H_2 A (\gamma x_1 + x_2) + z,$$

i.e., effectively reducing the MAC to a single-user
channel, where $H_2 A$ is the channel matrix, and $\gamma x_1$ and $x_2$
are two independent signals transmitted by this single user.
Therefore FDMA achieves the sum capacity.

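Example 3: $H_1$ and $H_2$ each have identical singular values,
$\sigma_{ij} = \sigma_i$, $i = 1, 2$; $j = 1, 2, \cdots, n_r$.

In this case, $k = \sigma_1^2/\sigma_2^2$, $m = n_r$, and
$\mathcal{S}\{u_{11}, \cdots, u_{1 n_r}\} = \mathcal{S}\{u_{21}, \cdots, u_{2 n_r}\}$,
since each set of $n_r$ orthonormal singular vectors spans the whole
receive space. Using the same argument as in Example 2 one can show
$\nu_1 P_2 = \nu_2 P_1$. Therefore all the conditions
in Theorem 2 are satisfied and $C_F = C$.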

Two special cases of Examples 2 and 3 are shown in Fig. 1
with $C_F = C$ when $\alpha = 0.8$. The example in [5, page 148]
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kt 


1 


0-^1, t 


<C  kPi  P  P2  ^  /vt 


(23) 


is also achievable by FDMA with $\alpha = 0.5$. In the above
examples, the channel matrices make the power constraint
automatically satisfied regardless of the values of $P_1$ and $P_2$,
i.e., the water-filling level $\nu_i$ is always proportional to $P_i$.
However, there are cases in which, even if the channel matrices
satisfy the singular value/vector constraints, one still needs the
right $P_1$ and $P_2$, as in Proposition 2. Here is an example.

Example  4:  For  a  MAC  with  Hi  =  diag  and 

H2  =  diag  we  show  in  Fig.2  the  relation  of 

^  v.s.  (Pi,P2).  It  can  be  shown  from  Proposition  2  that 
Cf  =  C  when  2Pi  +  P2  <  /2  and 

Pi  =  {(Pi,P2)|2Pi+P2  </l,Pi  >0,P2  >0}/ 

P2  =  {(Pi,P2)|4<2Pi+P2  </2,Pi  >0,P2  >0}/ 


Fig. 2. $C_F/C$ as a function of $P_1$, $P_2$ for Example 4.


In Fig. 2, $\mathcal{P}_1$ denotes the area between the axes and the line
segment EF, in which $C_F = C$ by allocating the power only
to the first eigenmode, and $\mathcal{P}_2$ denotes the area between the
line segments EF and MN, in which $C_F = C$ by allocating
the power up to the second eigenmode.

We  will  show  in  the  following  example  that  although  FDMA 
is  suboptimal  in  a  fading  MAC,  the  ergodic  sum  capacity 
can  be  asymptotically  achieved  when  the  number  of  antennas 
becomes  large. 

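Example 5: The ergodic sum capacity of a fading MAC is

$$C_e = E\left[ \log \left| \frac{P_1}{n_{t_1}} H_1 H_1^\dagger + \frac{P_2}{n_{t_2}} H_2 H_2^\dagger + I \right| \right],$$

where all entries of $H_1$ and $H_2$ are assumed to be zero mean
independent complex Gaussian with unit variance. From [8]
and [9] the ergodic sum capacity can be achieved by choosing

$$S_1 = \frac{P_1}{n_{t_1}} I, \qquad S_2 = \frac{P_2}{n_{t_2}} I.$$

For the same reason the ergodic maximum sum rate of
FDMA is

$$C_{Fe} = \max_{0 \le \alpha \le 1} E\left[ \alpha \log \left| \frac{P_1}{\alpha n_{t_1}} H_1 H_1^\dagger + I \right| + (1 - \alpha) \log \left| \frac{P_2}{(1 - \alpha) n_{t_2}} H_2 H_2^\dagger + I \right| \right].$$

For each realization of $H_1$ and $H_2$, with probability 1, the
condition of Theorem 1 cannot be satisfied. But when $n_{t_1}$ and
$n_{t_2}$ increase while $n_r$ is fixed, from the law of large numbers,

$$\frac{1}{n_{t_1}} H_1 H_1^\dagger \to I, \qquad \frac{1}{n_{t_2}} H_2 H_2^\dagger \to I.$$

So we have

$$\lim_{n_{t_1}, n_{t_2} \to \infty} C_e = n_r \log(P_1 + P_2 + 1),$$

$$\lim_{n_{t_1}, n_{t_2} \to \infty} C_{Fe} = \max_{0 \le \alpha \le 1} n_r \left[ \alpha \log\left( \frac{P_1}{\alpha} + 1 \right) + (1 - \alpha) \log\left( \frac{P_2}{1 - \alpha} + 1 \right) \right] = n_r \log(P_1 + P_2 + 1),$$

where the maximum is achieved when $\alpha = P_1/(P_1 + P_2)$.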
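
The maximizing $\alpha$ in the limit of $C_{Fe}$ can be verified with a short concavity argument. The derivation below is a sketch added for completeness in the notation of Example 5; it is not taken from the source.

```latex
% For 0 < \alpha < 1, Jensen's inequality for the strictly concave \log gives
\alpha \log\Big(1 + \frac{P_1}{\alpha}\Big) + (1-\alpha) \log\Big(1 + \frac{P_2}{1-\alpha}\Big)
  \le \log\Big( \alpha \Big(1 + \frac{P_1}{\alpha}\Big) + (1-\alpha) \Big(1 + \frac{P_2}{1-\alpha}\Big) \Big)
  = \log(1 + P_1 + P_2),
% with equality if and only if the two arguments coincide:
% P_1/\alpha = P_2/(1-\alpha), i.e. \alpha = P_1/(P_1 + P_2).
```

Multiplying both sides by $n_r$ recovers the limiting ergodic sum capacity, so the bound is tight exactly at the power-proportional band split.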

Therefore, in a fading MAC, with an increasing number of
transmit antennas, the ergodic sum capacity can be asymptotically
achieved by FDMA, and the bandwidth allocation factor is
proportional to the power of the corresponding user. This
result can also be explained by Theorem 2. Example 5 can
be considered a counterpart of Example 3, since all the
singular values of $H_1$ and $H_2$, respectively, asymptotically become
identical when $n_{t_1}$ and $n_{t_2}$ become large while $n_r$ is fixed.
In both the fading and non-fading cases, the optimal power
allocation for FDMA is to evenly distribute the power among
all transmit antennas. However, $k$ is dropped in the expression
of the optimal $\alpha$ for the fading case because the singular values
converge to $\sqrt{n_{t_1}}$ and $\sqrt{n_{t_2}}$ while the allocated power
per antenna is $P_1/n_{t_1}$ and $P_2/n_{t_2}$.

Intuitively, the asymptotic achievability of the ergodic sum
capacity by FDMA can be seen from the degrees-of-freedom
point of view. For a Gaussian vector MAC, the degrees of
freedom is $\min(n_{t_1} + n_{t_2}, n_r)$ if overlay transmission is used,
and is $\min(n_{t_1}, n_{t_2}, n_r)$ if FDMA is used. So when both
$n_{t_1} \ge n_r$ and $n_{t_2} \ge n_r$, the total degrees of freedom is not
decreased by orthogonal transmission, which makes it feasible
for FDMA to achieve the sum capacity.

In Figure 3, as $n_t$ becomes large, $C_{Fe}$ becomes close to
$C_e$.

IV. Quantification of sum-capacity suboptimality of FDMA

A  simple  example  for  EDMA  to  be  subpotimal  is  ^  = 
^  and  uii  =  ±U2i.  Next,  we  decouple  the  singular  value 
and vector conditions, and use Theorem 2 to evaluate their
individual impact on the sum capacity achievability of FDMA.

1) Singular vector: In this section the singular value conditions
are assumed to be satisfied, but the subspaces spanned
by the corresponding singular vectors are now different. In
practice this applies to the low SNR case, in which the power
for both users is allocated only to the largest eigenmode and

Fig. 3. Ratio of FDMA sum rate and ergodic sum capacity, where $n_r = 8$, $n_{t_1} = n_{t_2}$, $P_1 = P_2$.

Fig. 4. Example 6. The top plot is the ratio of $C_F$ and $C$, and the bottom plot is the angle of subspaces versus the rotation angle.

k  =  ^ .  However,  the  singular  vector  conditions  in  Theorem 
2, which require that the singular vectors of $H_1$ and $H_2$
span the same subspace, are seldom satisfied in practice.
If the singular vector subspaces associated with $\sigma_{1i}$ and $\sigma_{2i}$
are both rank 1, we can use the Euclidean inner product to
measure the angle between these two subspaces. Otherwise, in order
to evaluate the relation between the singular vectors and $C_F/C$, we must
have a mechanism that allows us to quantify the difference between
two subspaces. We use the distance between two subspaces,
defined in [10, page 76] as

$$\mathrm{dist}(\mathcal{U}_1, \mathcal{U}_2) = \|Q_1 - Q_2\|_2 = \sigma_{\max}(Q_1 - Q_2), \qquad (26)$$

where $\mathcal{U}_i$, $i = 1, 2$, are the subspaces, $Q_i$ is the orthogonal
projection matrix for $\mathcal{U}_i$, i.e., $Q_i = U_i U_i^\dagger$, where
the column vectors of the matrix $U_i$ form an orthonormal
basis of the subspace $\mathcal{U}_i$, and the 2-norm of $Q_1 - Q_2$ is its
largest singular value. In this definition $\mathcal{U}_1$ and $\mathcal{U}_2$ can have
different ranks. When $\mathcal{U}_1$ and $\mathcal{U}_2$ have the same rank, we can
use their largest principal angle $\phi$ to quantify their distance,
which is shown to be [10]

$$\phi = \sin^{-1}\left( \mathrm{dist}(\mathcal{U}_1, \mathcal{U}_2) \right). \qquad (27)$$
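
The definitions in (26) and (27) can be illustrated with a small numerical sketch. This is only an illustration for real subspaces given by orthonormal bases. The helper names are hypothetical, and the largest singular value is obtained here by a plain power iteration (a swapped-in technique chosen to keep the example dependency-free, not a method taken from the source).

```python
import math

def projector(basis):
    """Orthogonal projector Q = sum_k b_k b_k^T for an orthonormal real basis."""
    n = len(basis[0])
    q = [[0.0] * n for _ in range(n)]
    for b in basis:
        for i in range(n):
            for j in range(n):
                q[i][j] += b[i] * b[j]
    return q

def matvec(m, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

def spectral_norm(d, iters=200):
    """Largest singular value of a symmetric matrix d, by power iteration on d*d."""
    x = [1.0 + 0.1 * i for i in range(len(d))]  # generic start vector (assumption)
    for _ in range(iters):
        y = matvec(d, matvec(d, x))
        norm = math.sqrt(sum(t * t for t in y))
        if norm == 0.0:
            return 0.0
        x = [t / norm for t in y]
    dx = matvec(d, x)
    return math.sqrt(sum(t * t for t in dx))

def subspace_distance(basis1, basis2):
    """dist = ||Q1 - Q2||_2 as in (26)."""
    q1, q2 = projector(basis1), projector(basis2)
    d = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(q1, q2)]
    return spectral_norm(d)

def principal_angle(basis1, basis2):
    """Largest principal angle as in (27); valid for subspaces of equal rank."""
    return math.asin(min(1.0, subspace_distance(basis1, basis2)))

# Two planes in R^3 sharing the x-axis; the second is tilted by t radians.
t = 0.3
plane1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
plane2 = [[1.0, 0.0, 0.0], [0.0, math.cos(t), math.sin(t)]]
phi = principal_angle(plane1, plane2)
phi_same = principal_angle(plane1, plane1)
print(phi, phi_same)
```

For the tilted planes the recovered angle equals the tilt, and identical subspaces give a distance of zero.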

With (26) and (27) we can calculate $\phi$ and $C_F/C$ for a given
MAC, if $U_1$ and $U_2$ are two $n \times n$ real matrices. However,
in order to have a complete picture of the relation between
$\phi$ and $C_F/C$, we must develop a mechanism to allow $\phi$ to vary
continuously from 0 to $\pi/2$ to assess its effect on $C_F/C$. Toward
this end, we introduce the idea of rotation. The unitary matrix
$U_2$ can be obtained by rotating $U_1$ along an axis defined by
the subspace $\mathcal{A}$ of dimension $n - 2$ and by an angle $\theta$,

$$U_2 = \mathrm{rot}(U_1, \mathcal{A}, \theta),$$

where $\mathrm{rot}(\cdot)$ is derived in the Appendix. In the rotation
operation, each column vector of $U_1$ is rotated along
$\mathcal{A}$ by $\theta$; the principal angles of the singular vector subspaces of $U_1$
and $U_2$ for each corresponding eigenmode can be different.
The relation between $\theta$ and $\phi$ depends on the choice of $\mathcal{A}$. By
choosing $\mathcal{A}$ and letting $\theta$ vary in $[0, 2\pi]$, different $U_2$ are generated,
resulting in $\phi \in [0, \pi/2]$. This mechanism allows us to quantify
the relation of $C_F/C$ and $\phi$. Here is an example.


Fig. 5. The ratio of $C_F$ and $C$ versus $\phi$, the principal angle of the subspaces
for Example 6.


Example  6:  Ui  =  Vi  =  V2  =  I,  X)i  =  diag  (l,  1,  ^), 

iQ  0  0  ’ 

^111 

'1  1  0  0  ” 


,  A2 


S2  =  diag{2,2,\,A'^,  Ai  ■ 

T 

,  U2  =  rot(I,^,6>),  0  G/[0,27r],  Pi  = 


0  0  11 

p2  =  l. 

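The singular value conditions are satisfied and power is allocated
to only the first two eigenmodes. Signals are transmitted
in the subspaces $\mathcal{U}_i$ spanned by $[u_{i1}, u_{i2}]$, where $u_{i1}$ and $u_{i2}$
are unit-norm, mutually orthogonal vectors. The projection matrix for
$\mathcal{U}_i$ is

$$Q_i = [u_{i1}, u_{i2}] [u_{i1}, u_{i2}]^T. \qquad (28)$$

From (26)-(28), the angle between $\mathcal{U}_1$ and $\mathcal{U}_2$ is

$$\phi = \sin^{-1}\left( \sigma_{\max}\left\{ u_{11} u_{11}^T + u_{12} u_{12}^T - u_{21} u_{21}^T - u_{22} u_{22}^T \right\} \right).$$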


The results are shown in Figs. 4 and 5. While different choices
of the rotation axes $\mathcal{A}_1$ and $\mathcal{A}_2$ result in different curves of
$\theta$ and $\phi$, the curves of $C_F/C$ coincide and are monotonically
decreasing with $\phi$. As shown in Fig. 5, $C_F/C$ is approximately
linear with $\sin^2 \phi$. When $\theta = 0, 2\pi$, if $\mathcal{A}_1$ is the rotation
axis, or when $\theta = 0, \pi, 2\pi$ if $\mathcal{A}_2$ is the rotation axis, $\phi = 0$;
consequently $\mathcal{U}_1 = \mathcal{U}_2$, the conditions of Theorem 2 are
satisfied and $C_F/C = 1$. This is when the mutual interferences
from the two users are the worst, and FDMA benefits the
TABLE I

Overlay transmission.

$\sigma$ (dB)    user 1        user 2        C (bit/Hz/s)
< 0          (0, 1)        (1, 0)        2
= 0          (a, 1 − a)    (1 − a, a)    2
> 0          (1, 0)        (0, 1)        $1 + \log(1 + \sigma^2)$


TABLE  II 

FDMA  TRANSMISSION. 


a  (dB) 

user  1 

user  2 

C  (bit/Hz/s) 

<  -4.8 

fi  i) 

Vo’  2/ 

(1,0) 

1.79 

>  -4.8 

(-  A 

V2’  2) 

f  ao  1  1  — CKo  1  oco  1  — CKo  ^ 

V  2  2n-2  ^  _ 2 _ 2rT2  ) 

max  Cf(o) 

most via orthogonalization. Notice that in this case, generally
it is not necessary for the corresponding subspaces of all the
eigenmodes to coincide, but only the subspace of the active
eigenmodes  effective.  When  0  =  O.GGtt,  I.SStt  for  Ai,  and 
^  for  A2,  0  =  O.Stt,  Ui  ±  U2  and  ^  becomes  the 
minimum. This agrees with intuition: the orthogonality of the
subspaces allows both users to communicate simultaneously
at maximum rate without interfering with each other. Therefore
overlay transmission outperforms FDMA, since the latter
unnecessarily orthogonalizes transmission while the effective
channels $[u_{11}, u_{12}]$ and $[u_{21}, u_{22}]$ are already orthogonal.

2) Singular value: We still assume $H_1$ and $H_2$ are $n \times n$
real matrices. The singular vector conditions are satisfied,
but the singular value conditions are not. Since no general
matrix form represents this case, and the power conditions of
Proposition 2 are also involved, there is no uniform
way to show the individual effect of the singular value conditions
on FDMA sum capacity achievability. Hence, we use the
following example and, without loss of generality, we assume
$U_1 = U_2 = V_1 = V_2 = I$.

rii 

Example  7.*  Hi  =  ^ 

cr  <  20dB,  Pi  =  P2  =  1. 

The results are shown in Fig. 6; the sum rate and optimal
power allocation for overlay and FDMA are shown in Tables I
and II. For overlay transmission, the second user always puts
all the power to the eigenmode of the largest eigenvalue, while
the first user adaptively puts all the power to the orthogonal
direction. For FDMA, the optimal frequency allocation is

ao  =  0.48  when  a  </^l  +  i-^^)  ^  =  — 4.8dB,  and 
ao  =  argmaxQ,^[o,i]  Cf{ol)  when  a  >  — 4.8dB,  where 


,  H 


2  — 


If 

0 


0 

(T 


I 

,  -20dB  < 


Cf{ol)  = 


+  alog 


/  (JP2  1  + 
^  2a 


2 


Fig. 6. $C$, $C_F$ and their ratio versus $\sigma$ for Example 7.


from  OdB.  When  ct  — >  oo,  ao  ^  0  and 
lim  (C-Cf) 

a^oo 

=  lim  1  +  log cr^ -log 

cr^oo  \  Z  Z  / 

=  1. 

One user's rate becomes dominant, thus FDMA asymptotically
achieves the sum capacity, in the sense that $C_F/C \to 1$, with
bandwidth allocation increasingly favoring the dominant user. The
difference between $C$ and $C_F$ approaches 1 bit/Hz/s. However,
$C_F/C$ is not monotone in $\sigma$.

V.  Conclusion  and  extension 

Orthogonal  transmission  in  vector  Gaussian  MAC  was 
studied  in  this  paper.  We  derived  sufficient  and  necessary 
conditions  for  FDMA  to  achieve  the  sum  capacity.  The  sum 
rate  degradation  of  orthogonal  transmission  was  quantified  by 
the  distance  of  singular  vector  subspaces  and  disproportional 
singular  values.  Parallel  results  for  Gaussian  vector  MAC  with 
more  than  two  users  can  also  be  similarly  obtained. 

Appendix  A 
Rotation  of  Subspace 

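We use Fig. 7 as an example to derive $\mathrm{rot}(U_1, \mathcal{A}, \theta)$. $\mathcal{A}$ is a
rank $n - 2$ subspace in $\mathbb{R}^n$ and is the rotation axis, depicted
as a line in Fig. 7. $A$ is an $n \times (n - 2)$ matrix whose column
vectors form an orthonormal basis of the subspace $\mathcal{A}$.
Vector $\overrightarrow{OB}$ is the $i$th column vector of $U_1$ and is denoted as
$u_i$. Then from the definition of $\overrightarrow{OM}$ and $\overrightarrow{ME}$ we have

$$u_i' \triangleq \overrightarrow{OB'} = \overrightarrow{OM} + \overrightarrow{ME} + \overrightarrow{EB'}. \qquad (29)$$

Define $Q$ as the projection matrix of the subspace $\mathcal{A}$; then
$Q = A A^T$. With this we have

$$\overrightarrow{OM} = Q u_i, \qquad (30)$$

$$\overrightarrow{ME} = (\overrightarrow{OB} - \overrightarrow{OM}) \cos\theta = (u_i - Q u_i) \cos\theta.$$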


So  when  a  =  OdB,  S/2  =  0.51,  o^o  =  0.5,  Of  =  C  =  2bit. 
In  the  neighborhood  of  OdB,  ^  decreases  as  a  moves  away 


(31) 
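Fig. 7. Rotation of the subspace $\mathcal{U}_1$ in $\mathbb{R}^n$ by the angle $\theta$ along the
axis $\mathcal{A}$, where $\mathcal{A}$ can be any subspace of dimension $n - 2$ and $\mathcal{U}_1$ can
be any subspace formed by some column vectors of the matrix $U_1$. $O$ is the
origin of the space, vectors $\overrightarrow{OM}$ and $\overrightarrow{ON}$ are in $\mathcal{A}$, $\overrightarrow{OB}$ and $\overrightarrow{OC}$ are
two linearly independent vectors of $\mathcal{U}_1$ and are rotated to vectors $\overrightarrow{OB'}$ and
$\overrightarrow{OC'}$ respectively. The projections of $\overrightarrow{OB}$ and $\overrightarrow{OC}$ in $\mathcal{A}$ are $\overrightarrow{OM}$ and $\overrightarrow{ON}$
respectively. $\theta = \angle CNC' = \angle BMB'$ is the rotation angle. $\overrightarrow{ME}$ is the
projection of $\overrightarrow{MB'}$ on $\overrightarrow{MB}$.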


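Since $\mathcal{S}\{\overrightarrow{MB}, \overrightarrow{MB'}\} \perp \mathcal{A}$, we have $\overrightarrow{EB'} \perp \mathcal{S}\{\overrightarrow{MB}, \mathcal{A}\}$. Define an
$n \times n$ matrix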


W  = 


(Ui 


—  cos6> 

AT 


(32) 


where $\mathbf{1}$ is a $1 \times n$ row vector with all elements equal to 1.
Define the adjugate of $W$ as $W^*$ and the first column vector of
$W^*$ as $w_1$; then $w_1 \perp \overrightarrow{MB}$ and $w_1 \perp \mathcal{A}$, since $W W^* = |W| I$.
With this, we have


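$$\overrightarrow{EB'} = |\overrightarrow{EB'}| \cdot \frac{\overrightarrow{EB'}}{|\overrightarrow{EB'}|} = \|u_i - Q u_i\| \sin\theta \, \frac{w_1}{\|w_1\|}. \qquad (33)$$

Substituting (30), (31) and (33) into (29) gives

$$u_i' = Q u_i + (u_i - Q u_i) \cos\theta + \|u_i - Q u_i\| \sin\theta \, \frac{w_1}{\|w_1\|}. \qquad (34)$$

Then we have $U_2 = \mathrm{rot}(U_1, \mathcal{A}, \theta)$, whose columns are $u_i'$,
$i = 1, \cdots, n$, as defined in (34).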

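In Fig. 7, although the rotation angle is $\theta$, depending on the
choice of $\mathcal{A}$, the angle between the two vectors $\overrightarrow{OB}$ and $\overrightarrow{OB'}$ and
the principal angle $\phi$ of the two subspaces $\mathcal{S}\{\overrightarrow{OB}, \overrightarrow{OC}\}$ and
$\mathcal{S}\{\overrightarrow{OB'}, \overrightarrow{OC'}\}$ can be different from $\theta$.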
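
The rotation of a single column can be sketched for the special case $n = 3$, where the axis has rank $n - 2 = 1$. This is an illustrative sketch under stated assumptions, not the paper's general construction: the direction perpendicular to both the axis and $\overrightarrow{MB}$ is obtained from a cross product rather than from the adjugate of $W$, the sign convention of that direction is an assumption, and the function names are hypothetical. The example also shows that the angle between a vector and its rotated image can differ from the rotation angle $\theta$.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def norm(x):
    return math.sqrt(dot(x, x))

def rotate_about_axis(u, a, theta):
    """Rotate u in R^3 about the unit axis vector a by theta, using the
    three-term decomposition OM + ME + EB'."""
    qu = [dot(a, u) * ai for ai in a]        # OM = Q u, with Q = a a^T
    r = [ui - qi for ui, qi in zip(u, qu)]   # MB = u - Q u
    w = cross(a, r)                          # perpendicular to the axis and to MB
    w_norm = norm(w)
    if w_norm == 0.0:                        # u lies on the axis: unchanged
        return list(u)
    r_norm = norm(r)
    return [qu[i] + r[i] * math.cos(theta) + r_norm * math.sin(theta) * w[i] / w_norm
            for i in range(3)]

s = 1.0 / math.sqrt(2.0)
u = [s, 0.0, s]              # unit vector, 45 degrees away from the axis
axis = [0.0, 0.0, 1.0]
theta = math.pi / 2.0

u_rot = rotate_about_axis(u, axis, theta)
angle = math.acos(max(-1.0, min(1.0, dot(u, u_rot) / (norm(u) * norm(u_rot)))))
print(u_rot, angle, theta)
```

Here the rotation angle is $\pi/2$, while the angle between the vector and its image is $\pi/3$, because only the component orthogonal to the axis is rotated.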
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